
Big Data and Machine Learning


Machine learning is also used to carry out churn prediction of bank customers or telecom customers. Churn prediction helps the bank or telecom operator to come up with various plans for retaining their high-end customers.

Researchers in computer security use ML to detect anomalous behavior in streaming data. ML techniques are also used to ascertain energy usage patterns and match them with real-time demand response in the fields of Green Computing and Smart Energy. In a common use case, customers provide their energy requirements and the system applies ML to come up with the most economical plan for energy consumption and utilization.
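As a rough illustration of anomaly detection on streaming data, the sketch below flags a reading that falls far from the running average of what has been seen so far. It is a minimal example rather than a production detector: the class name, the threshold of three standard deviations, and the warm-up length are all assumptions made for illustration.

```python
from math import sqrt


class StreamingAnomalyDetector:
    """Hypothetical detector: flags values far from the running mean."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # assumed cut-off, in standard deviations
        self.warmup = warmup        # readings to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Return True if x looks anomalous, then fold x into the statistics."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.threshold * std:
                is_anomaly = True
        # Welford's online update: one pass, constant memory per stream.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly


detector = StreamingAnomalyDetector()
readings = [10, 11, 9, 10, 12, 10, 11, 50, 10]
flags = [detector.update(r) for r in readings]
anomalous_positions = [i for i, flagged in enumerate(flags) if flagged]
```

Because the statistics are updated one reading at a time, the detector never needs to store the stream itself. Note that this sketch also folds flagged readings into the running statistics; a real system might choose to exclude them.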

Computational biologists are also using ML for time-series data models to unravel the mysteries of the human genome. In the biochemical field, ML is used to discover new drugs that would otherwise have taken hundreds of costly experiments to find. ML is used in the financial domain to predict stock prices. In general, ML is being used, and will continue to be used, wherever predictions need to be made and financial performance is at stake.

Big Data Challenges for Machine Learning

Big Data, as we’ve covered, combines data from various sources. An operator, for example, might hold mobile usage data for customers in various geographical locations along with their social media data. This combination makes it possible to predict behaviors in ways that were not possible in the past, but it also increases the demands created by the volume and variety of data.

To gain full advantage from the data, it has to be analyzed in real time. For this, tools that can make predictions and act on them must be in place. Machine learning tools are a solution, but the technology is nascent. Although a few machine learning solutions are available, there is quite a bit of room for improvement.

The three characteristics of Big Data (volume, velocity, and variety) pose challenges for the machine learning field. To handle the high volume of data, machine learning models need to be scalable. To accommodate the high velocity of data, the models need to work in real time for fast decision making. Finally, to manage the high variety of data, solutions require integration across disciplines, with a sound understanding of the domains and the ability to give meaning to the data.

Distributed Machine Learning

Big Data is, by definition, very large, and in order to process it, the number crunching is distributed across many machines. This creates the concept of distributed machine learning: each machine learns a model from its part of the data, and an aggregated model is then built out of the sub-models. Distributed machine learning will lead to the development of more asynchronous algorithms, in which different processing elements can move forward independently without the need for synchronization. The final aim is to reduce the time needed to build the model, so that machine learning results arrive faster and are more accurate. In the future, this should deliver models that are built as the data streams in, thus addressing both volume and velocity.
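The idea of building an aggregated model out of sub-models can be sketched in a few lines. The example below is one simple way to do it, not the only one: it fits a straight line to each shard of the data and then averages the parameters, weighted by shard size. The function names are hypothetical, and the shards are processed one after another here, whereas a real deployment would fit each one on a separate machine.

```python
from statistics import mean


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept on a single shard."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def aggregate(sub_models, shard_sizes):
    """Average the sub-model parameters, weighted by shard size."""
    total = sum(shard_sizes)
    slope = sum(m[0] * n for m, n in zip(sub_models, shard_sizes)) / total
    intercept = sum(m[1] * n for m, n in zip(sub_models, shard_sizes)) / total
    return slope, intercept


def train_distributed(data, n_shards):
    """Split the data into shards, fit each one, and combine the results."""
    shards = [data[i::n_shards] for i in range(n_shards)]
    sub_models = [fit_line(shard) for shard in shards]  # one fit per "machine"
    return aggregate(sub_models, [len(shard) for shard in shards])


# Toy data that lies exactly on y = 2x + 1 (an assumption for illustration).
data = [(float(x), 2.0 * x + 1.0) for x in range(12)]
slope, intercept = train_distributed(data, n_shards=3)
```

Parameter averaging works well for this toy case because every shard sees the same underlying relationship. With noisy or unevenly distributed data, the averaged model is generally only an approximation of the model that would be learned from the full data set.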

Dynamic offerings

Due to the size of Big Data, it will become very important to decide how many features the model needs, in order to keep the speed and cost of computation manageable. Researchers have found that a model's learning curve flattens after a certain point as more data is added. Therefore, new algorithms that select the most appropriate features are being developed.
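Feature selection can take many forms. As a minimal sketch of one of the simplest, the example below ranks each candidate feature by the strength of its correlation with the value being predicted and keeps only the top few. The feature names and figures are invented for illustration, and correlation ranking is only one of several possible selection criteria.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_top_features(features, target, k):
    """Keep the k features most strongly correlated with the target."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical customer data: three candidate features and one target.
features = {
    "monthly_minutes": [10, 20, 30, 40, 50, 60],
    "plan_code": [3, 3, 3, 3, 3, 3],       # constant, carries no signal
    "support_calls": [5, 1, 4, 2, 6, 3],
}
target = [12, 19, 33, 41, 48, 62]

best_one = select_top_features(features, target, k=1)
best_two = select_top_features(features, target, k=2)
```

Dropping features such as the constant `plan_code` column reduces the cost of training without giving up predictive power, which is the trade-off described above.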

In some fields, especially speech recognition, a new accent or a new cell phone introduces new variables that cannot be discarded when the model learns. The system needs to learn from this data to help the customer in the future. In the case of online advertising, throwing away data can lead to wrong results. In other words, in these fields, the more data there is, the more accurate the model. The ability to quickly model and manage these new variables will enable more fully featured, dynamic offerings.


