TensorFlow 2.0 — Feature Engineering Complexity continues

Sandeep Srinivasan
Oct 8 · 4 min read

Very low levels of abstraction can inhibit and delay innovation

The first time I looked at TensorFlow, it reminded me of assembly language, or perhaps Fortran!
These languages are powerful, but their low level of abstraction makes them extremely tedious to program in, even for small tasks. This lack of abstraction can pose many challenges and barriers to rapid innovation.

The level of abstraction matters

If researchers and developers today had to write FFT (Fast Fourier Transform) algorithms in Fortran, rather than use a software package such as Matlab, Mathematica or Octave, DSP chips (digital signal processors) would not be able to translate even simple words, let alone paragraphs and music. In addition, the DSP chip that is present in every mobile device (and in Alexa, Google Home, etc.) would perhaps cost $5000 and not $1.50.

One of the key elements that enabled rapid innovation in DSPs over the last two decades was the level of abstraction that Matlab, Mathematica and similar software packages provide: a complex computation such as an FFT takes just one line of code:

Y = fft(X)
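To make the point about abstraction concrete, here is a minimal sketch of what that one line hides. It is a textbook recursive radix-2 Cooley-Tukey FFT written in plain Python, not the implementation used by Matlab or any other package, and it assumes the input length is a power of two.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT.

    Illustrative only: assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Split into even- and odd-indexed samples and transform each half.
    even = fft(x[0::2])
    odd = fft(x[1::2])

    # Combine the two half-size transforms with the twiddle factors.
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Usage: one period of a sine wave sampled at four points.
spectrum = fft([0, 1, 0, -1])
magnitudes = [abs(v) for v in spectrum]
```

Even this short version asks the programmer to think about index splitting, twiddle factors and input-length restrictions, which is exactly the detail a call such as fft(X) abstracts away.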

TensorFlow 2.0

TensorFlow, while powerful with all the great algorithms it embeds, needs to be augmented with additional layers of abstraction to scale…

Even though TensorFlow 2.0 is a significant leap forward in raising the abstraction level through the Keras APIs, it still does not sufficiently reduce the complexity of the Feature Engineering needed to build accurate Machine Learning models. It is thus very hard to use for domain experts and other professionals who are not Machine Learning specialists, which makes one wonder whether non-ML experts are simply ‘not the target audience’ for TensorFlow 2.0.

Automatic Feature Engineering vs. TensorFlow 2.0 Linear Model on the Titanic Dataset

The TensorFlow 2.0 documentation describes a linear classifier example on the Titanic dataset. The goal is to build a linear classifier that predicts whether a person survived the journey on the Titanic, based on a set of input features. The dataset contains both categorical and numerical features. To handle this in TensorFlow 2.0, a lot of code has to be written just to build a simple linear classifier model. It also requires a deep understanding of Machine Learning models, Feature Engineering and TensorFlow.
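The kind of manual work involved can be sketched in plain Python. This is not TensorFlow code and not the code from the TensorFlow example; it is a hand-rolled illustration of the steps a developer has to spell out: declare which columns are categorical and which are numeric, build a vocabulary per categorical column, and turn each row into a numeric vector. The three rows are made-up values shaped like the Titanic data, and the helper names are hypothetical.

```python
# Hypothetical rows shaped like the Titanic dataset; the values are made up.
rows = [
    {"sex": "male", "class": "Third", "age": 22.0},
    {"sex": "female", "class": "First", "age": 38.0},
    {"sex": "female", "class": "Third", "age": 26.0},
]

# Step 1: the developer declares by hand how each column must be treated.
CATEGORICAL_COLUMNS = ["sex", "class"]
NUMERIC_COLUMNS = ["age"]


def build_vocabularies(rows, columns):
    """Step 2: collect the sorted distinct values of each categorical column."""
    return {col: sorted({row[col] for row in rows}) for col in columns}


def encode_row(row, vocabularies, numeric_columns):
    """Step 3: one-hot encode categorical values, then append numeric values."""
    vector = []
    for col, vocab in vocabularies.items():
        vector.extend(1.0 if row[col] == value else 0.0 for value in vocab)
    vector.extend(float(row[col]) for col in numeric_columns)
    return vector


vocabularies = build_vocabularies(rows, CATEGORICAL_COLUMNS)
encoded = [encode_row(row, vocabularies, NUMERIC_COLUMNS) for row in rows]

for vector in encoded:
    print(vector)
```

Every one of these decisions (column types, vocabularies, encoding scheme) is left to the developer, and each has to be revisited whenever the dataset changes.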

Titanic Dataset
Feature Engineering on the Titanic Dataset using TensorFlow 2.0

VERIFAI Machine Learning Platform: Automatic Feature Engineering

VerifAI’s Automatic Feature Engineering is a set of algorithms that transform the input data into a form (numerical vectors) that the Machine Learning Algorithms can understand. Our Automatic Feature Engineering hides the complexity and details of data transformation required before feeding the data into a DNN (Deep Neural Network) or other machine learning models.

For instance, feature columns in the Titanic dataset such as {‘fare’, ‘sex’, ‘n_siblings_spouses’, ‘class’, ‘deck’, ‘embark_town’ and ‘alone’}, most of which are categorical, are automatically encoded into numerical values that a machine learning model can understand, using a feature hasher, factorizer, one-hot encoding or other encoding algorithms. A model’s accuracy depends on how its inputs are encoded and interpreted by the machine learning algorithms, so it is important to map, encode, transform and combine the features accurately before feeding them into an ML model.
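Two of the encodings named above, factorizing and feature hashing, can be sketched in a few lines of standard-library Python. This is a generic illustration of the techniques, not VerifAI's implementation; the function names, the bucket count and the sample ‘embark_town’ values are assumptions chosen for the example.

```python
import hashlib


def factorize(values):
    """Map each distinct value to an integer code, in order of first appearance."""
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


def hash_bucket(value, n_buckets):
    """Hashing trick: map a string to a bucket index without storing a vocabulary."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


# Illustrative sample of an 'embark_town'-style column.
embark_town = ["Southampton", "Cherbourg", "Southampton", "Queenstown"]

codes, mapping = factorize(embark_town)
buckets = [hash_bucket(town, n_buckets=8) for town in embark_town]

print(codes)    # integer code per row
print(buckets)  # bucket index per row, each in the range 0..7
```

The trade-off is that a factorizer needs to see and store every distinct value, whereas a feature hasher needs no vocabulary but can map two different values to the same bucket.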

ML engineers spend a lot of time mapping features into usable inputs for models. Our Automatic Feature Engineering (AutoMapper) is an important step towards simplifying and democratizing ML, making it available to all.

Automatic mapping and selection of Features that are usable by a DNN and other ML models
Building a Classifier for the Titanic Dataset using VerifAI Machine Learning Platform — 4 lines of code

The VerifAI Machine Learning Platform allows developers to build Classifiers, Regressors and Reinforcement Learning Algorithms with just a few lines of code or no code at all.

VerifAI AutoMapper Produces Feature Analysis plots
Feature Importance Plot produced by the VerifAI AutoMapper

