Data pre-processing techniques generally refer to the addition, deletion, or transformation of training set data.

Although this text is primarily concerned with modeling techniques, data preparation can make or break a model’s predictive ability. Different models have different sensitivities to the type of predictors in the model; how the predictors enter the model is also important. Transformations of the data to reduce the impact of data skewness or outliers can lead to significant improvements in performance. Feature extraction is one empirical technique for creating surrogate variables that are combinations of multiple predictors.

Simpler strategies, such as removing predictors based on their lack of information content, can also be effective.
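As a rough illustration of two of the ideas above, the sketch below removes a predictor that carries no information (a single constant value) and applies a log transformation to reduce right skew. It is a minimal example under stated assumptions, not the text's own procedure: the toy training set, the predictor names `price` and `region`, and the helper functions are all hypothetical, and the log transform is just one common choice for strictly positive, right-skewed data.

```python
import math


def skewness(values):
    """Skewness as the third central moment over the 1.5 power of the variance."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def drop_constant_predictors(columns):
    """Keep only predictors that take more than one distinct value."""
    return {name: vals for name, vals in columns.items() if len(set(vals)) > 1}


def log_transform(values):
    """Natural log of each value; only valid for strictly positive data."""
    return [math.log(v) for v in values]


# Hypothetical toy training set (not from the text).
train = {
    "price": [1, 2, 4, 8, 16, 32, 64],   # strongly right-skewed
    "region": [1, 1, 1, 1, 1, 1, 1],     # constant, so uninformative
}

cleaned = drop_constant_predictors(train)
raw_skew = skewness(cleaned["price"])
log_skew = skewness(log_transform(cleaned["price"]))

print("predictors kept:", sorted(cleaned))
print("skewness before log transform:", round(raw_skew, 3))
print("skewness after log transform:", round(log_skew, 3))
```

In this toy case the values double at each step, so their logs are evenly spaced and the skewness falls to roughly zero. Real predictors rarely behave this cleanly, and any transformation should be estimated on the training set only and then applied unchanged to new data.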

--

--

Data from the Transit Cost Project (TCP) allows us to ask questions like why transit-infrastructure projects in New York City cost 20 times more on a per-kilometer basis than in Seoul. The data covers hundreds of transit projects in more than 50 countries, totaling more than 11,000 km of urban rail built since the late 1990s. TCP is interested in figuring out how to deliver more high-capacity transit projects for a fraction of the cost in countries like the United States.

--

--

I recently read an article by Andrew Stewart, a lecturer at Johns Hopkins Engineering, in which he describes open-source tools available for enterprise-level data management across the entire data lifecycle. It turns out that a select few candidates handle data warehousing, modeling, staging, and storage at little to no cost.

--

--

As digital transformation has accelerated, the e-commerce landscape has become increasingly dynamic. New players have emerged at the same time that established actors have taken on new roles; some barriers to e-commerce at the firm, individual and country levels have been overcome, while new barriers have emerged. New business models have transformed buyer-seller relationships and pushed out the frontier of what is possible to buy and sell online. More firms are buying and selling online than ever before, including across borders, and the absolute value of the e-commerce market is growing. This is true across industries, including in traditionally consumer-facing sectors.

--

--