Planet Big Data

Planet Big Data is an aggregator of blogs about big data, Hadoop, and related topics. We include posts by bloggers worldwide. Email us to have your blog included.


August 16, 2018

Revolution Analytics

Make R speak

Ever wanted to make R talk to you? Now you can, with the mscstts package by John Muschelli. It provides an interface to the Microsoft Cognitive Services Text-to-Speech API (hence the name) in Azure,...


August 15, 2018

Jean Francois Puget

TrackML Challenge



I just won a gold medal for finishing 9th in the TrackML challenge hosted on Kaggle. The challenge was proposed by CERN (Conseil Européen pour la Recherche Nucléaire). The problem was to reconstruct the trajectories of high-energy physics particles from the tracks they leave in the detectors used at CERN. The data we were given was actually a simulation of forthcoming detectors at CERN. Here is how CERN describes the challenge:

To explore what our universe is made of, scientists at CERN are colliding protons, essentially recreating mini big bangs, and meticulously observing these collisions with intricate silicon detectors.

While orchestrating the collisions and observations is already a massive scientific accomplishment, analyzing the enormous amounts of data produced from the experiments is becoming an overwhelming challenge.

Event rates have already reached hundreds of millions of collisions per second, meaning physicists must sift through tens of petabytes of data per year. And, as the resolution of detectors improve, ever better software is needed for real-time pre-processing and filtering of the most promising events, producing even more data.

To help address this problem, a team of Machine Learning experts and physics scientists working at CERN (the world largest high energy physics laboratory), has partnered with Kaggle and prestigious sponsors to answer the question: can machine learning assist high energy physics in discovering and characterizing new particles?

Specifically, in this competition, you’re challenged to build an algorithm that quickly reconstructs particle tracks from 3D points left in the silicon detectors.

I used an unsupervised machine learning technique known as clustering; see my detailed writeup. The key was to preprocess the data so that the clustering algorithm could find the particle tracks more easily. This is very similar to feature engineering for supervised machine learning. The main issue was the computational resources required by my solution: about 20 CPU hours per event, with tracks to predict for 125 events. This would not have been possible without parallelism on multicore machines. I used an IBM Power 9 machine with 40 cores to compute a submission in less than 3 days.
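The resource figures above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes perfect scaling across cores (an idealization, since real runs lose some time to overhead), and the variable names are chosen for illustration.

```python
# Rough estimate of the wall-clock time needed for one submission.
# The three figures come from the post; perfect parallel scaling is an assumption.
cpu_hours_per_event = 20
n_events = 125
n_cores = 40

total_cpu_hours = cpu_hours_per_event * n_events   # total sequential work
wall_clock_hours = total_cpu_hours / n_cores       # work spread evenly over the cores
wall_clock_days = wall_clock_hours / 24

print(f"{total_cpu_hours} CPU hours -> {wall_clock_days:.1f} days on {n_cores} cores")
```

On a single core the same work would take more than 100 days, which is why parallelism was essential here.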

Many participants also had significant running times; for instance, the participant who finished second says that some events take him up to 3 CPU days. This makes the achievement of the first place winner extremely impressive. Not only do they achieve an amazing detection rate, but their code runs in 8 minutes per event! I'm not in the same league as them.

Anyway, I'm still happy with my result. And the cherry on the cake: this gold medal earned me the Kaggle Competitions Grandmaster title :) I'm the second Kaggler to become a Grandmaster in two categories.








August 14, 2018

Revolution Analytics

Microsoft R Open 3.5.1 now available

Microsoft R Open 3.5.1 has been released, combining the latest R language engine with multi-processor performance and tools for managing R packages reproducibly. You can download Microsoft R Open...


August 13, 2018

Ronald van Loon

How Data Innovators Lead Data Transformation Forward

Data transformation is currently upon us, and we can’t deny the fact that the future we talked about is now here. The world is finally progressing towards a more mature period of digital competence, and data innovators sit at the center of it all. Data-driven companies have an important role to play in ensuring innovation and making sure that customers get the services they require.

I recently had the experience of watching Oracle Modern Business Experience Videos by Oracle CEO Mark Hurd and Vice President, Cloud and Technology, John Abel. Both explained the importance of data innovation today, and shared their recent experiences. They also described how companies across the globe are changing their ecosystems by innovating with data transformation.

The Importance of Being a Data Innovator

As mentioned above, being a data innovator is growing in importance. With many organizations jumping onto the data bandwagon, innovators make your company stand out in how it performs with data. Not only do these forward thinkers help your organization gain a clear footing, but they also try to live up to the short-term performance expectations that customers have of them. These expectations can place a major burden on CEOs early on, who often fall prey to them and end up being fired.

A recent research study has mentioned how many CEOs find it tough going at the start of their tenure. These CEOs go through a troublesome stage up front, which they can only overcome through innovation. CEOs that fail to manage this are likely to experience numerous problems and often end up faltering underneath them. Research states that 40 percent of all new CEOs don’t last more than 18 months in their new positions. The pressure and expectations are high, and they struggle to deliver satisfactory performance.

Big organizations have been on the front line of these changing times, and the data backs that up. Almost 50 percent of the Fortune 500 firms from a decade or so ago are not there anymore. These companies either don’t exist, or have failed to keep up with the times. This shows us that having great potential and a captive market is not everything.

Moving to the hybrid cloud was an innovation, and organizations that were able to move on from legacy systems have benefitted from it. However, organizations that didn’t move to the cloud have had to live with reduced operational efficiencies, among many other problems.


AI, or Artificial Intelligence, nowadays plays an important role in data integration and enabling you to provide a service where you give your customer the attention and the personalization that they deserve. Many customers come with specific demands, and the best way to satisfy them is giving them the personalization they want.

AI helps in doing this, as it allows marketers and point-of-contact agents to reach out to every customer on a personal level. Customers often call in to talk about issues, and agents are tasked with collecting the history of the client to make that conversation even more personalized and informative. Without AI, this can be a troublesome process that involves numerous complications.

However, with AI, you can provide much more help. Organizations that use AI as a means of reaching out to clients, and making conversations more personalized than ever before, benefit by having data immediately available. When you give customers the level of personalization they need, you can actually use that moment to create the chemistry you want. For that one moment, you can make the client yours, thanks to precise personalization.

Not only does AI make customer management easier, but it also makes infrastructure management efficient. Artificial Intelligence includes Machine Learning, which helps to make infrastructure more manageable.

Data-Driven Companies

Many product-driven companies didn’t experience the drastic growth they would have wanted. Companies such as Domino’s Pizza took a long time to increase their market worth to over a billion dollars. This is because such companies had to do this without the benefits of data-driven innovation. They had to use traditional methods for growth, which took some time to reap rewards.

By contrast, companies that have been driven by data have experienced rapid growth. Jet.Com is one company that reached the market value of $1 Billion in almost no time. It took around 4 months to cross the figure, due to their effective model that’s focused on data growth and feasibility. Jet.Com used innovative data from airline companies to give interested consumers the best source of data related to price and other services. Companies focused on data are able to achieve this growth on the basis of innovative offerings. They give innovative data sources to customers, which eventually propagates their cause.

Innovators in the Future

Innovators of the future will come across fewer of the barriers that we see today, thanks to the emergence of technology. Innovation starts with something both new and useful. Innovators focus on discovering data sources that fit their cause. These data sources are extremely important, because whatever analysis you perform will depend on how clean the data is. And the data source you choose largely determines that quality.

Once they have found the data needed for their cause, they learn from it. Learning from the data available to you requires putting analysis tools in place, and getting output that is as relevant as possible.

With information from the data collected, these innovators of the future will find out what’s unknown, and will make the unknown known through the use of their superior data sources. Companies that exploit data well are not only growing fast, but are also seen as the ones currently doing the best on the stock market.

Role of Autonomous Cloud

The autonomous cloud takes away many costs and issues that data innovators had to go through previously. It manages affairs in an autonomous manner and helps give data innovators the luxury to create cloud storage and leave it there for future use.

The service provides a self-managing and self-securing database in the cloud, which helps minimize the workload hassle of database management. With data visualization methods, the autonomous database cloud can help you spot trends that might otherwise have gone unnoticed.

The future holds a lot of unforeseen implications and opportunities for data innovators. They are now required to stand tall in the face of adversity and make the unknown known. As they strive to do so, they will unlock numerous achievements that will mark the data trends in the years to come.

If you want to learn more about the innovators of the future, you can watch videos of Oracle’s Modern Business Experience here.


Ronald helps data-driven companies generate business value with best-of-breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining. He is a guest author on leading Big Data sites, a speaker, chairman and panel member at national and international webinars and events, and runs a successful series of webinars on Big Data and on Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a Data Consultancy company, a leader in Big Data and data process management solutions. He has a broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.


The post How Data Innovators Lead Data Transformation Forward appeared first on Ronald van Loons.


Tutorial: Building predictive models in AdvancedMiner

In this tutorial you will learn how to build a predictive model using AdvancedMiner in a few simple steps. First of all, go to the free AdvancedMiner download page.

How to build a predictive model with AdvancedMiner?

Create an alias – all databases are stored there.

Building predictive models in AdvancedMiner - 1

Select the folder where you want to save it.

Building predictive models in AdvancedMiner - 2

Import the database from your hard drive.

Building predictive models in AdvancedMiner - 3

Building predictive models in AdvancedMiner - 4

Examples of using Freq and context scripts

View the data. In this example, Class is the attribute that will be predicted. In addition to displaying the data, you can perform many other operations on your database, e.g. use Freq for data analysis. Freq calculates statistics for the analysed variables, which lets you obtain important information about the attributes. It also lets you visualize variable distributions, build various histograms, study the relationship between the variables and the target, and group the values of a categorical variable.
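To give a rough idea of the kind of statistics a frequency analysis produces, here is a minimal sketch in plain Python. It is not AdvancedMiner's Freq implementation; the data, the function name `frequency_table` and the output fields are hypothetical and only illustrate counting values of a categorical attribute and relating them to the target.

```python
from collections import Counter

# Hypothetical rows: (value of a categorical attribute, value of the target "Class").
rows = [
    ("red", "yes"), ("blue", "no"), ("red", "yes"),
    ("green", "no"), ("red", "no"), ("blue", "yes"),
]

def frequency_table(rows):
    """Count each attribute value and how often the target is "yes" for it."""
    totals = Counter(value for value, _ in rows)
    positives = Counter(value for value, target in rows if target == "yes")
    return {
        value: {
            "count": n,                           # number of rows with this value
            "share": n / len(rows),               # fraction of all rows
            "target_rate": positives[value] / n,  # fraction of "yes" among them
        }
        for value, n in totals.items()
    }

table = frequency_table(rows)
for value, stats in sorted(table.items()):
    print(value, stats)
```

A table like this is what a histogram of a variable, or a plot of the variable against the target, is drawn from.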

Another interesting feature is context scripts that allow you to divide and sort tables. Context scripts also have more applications, but if you need even more advanced functions you can create and add your own scripts to your liking.

Building predictive models in AdvancedMiner - 5

Gython Script – creating Python scripts

To predict success or failure easily, you can change the target into a numeric value by modifying the database. Right-click on the project, select “New” and then “Gython Script”.

Building predictive models in AdvancedMiner - 6

Gython Script allows you to create scripts in Gython, the Python-based language used for data processing. Gython also accepts SQL queries, which makes numerous operations and modifications on the databases possible. SQL is available after entering “sql:” or “sql in `alias_name`”.

The ability to combine Python and SQL commands is a key feature of this language: you can use functionality from both languages at the same time (e.g. run SQL commands in a loop). In this example, we will use SQL to make changes.

Building predictive models in AdvancedMiner - 7

A column with values of “1” and “0” was created. This is our new target.

Building predictive models in AdvancedMiner - 8
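The target conversion described above (a new column holding “1” or “0”) can be sketched in standard Python with the built-in `sqlite3` module. This is not Gython syntax, which is not reproduced here; the table name `data`, the label values and the column name `target` are assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for the imported table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, Class TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "success"), (2, "failure"), (3, "success")],
)

# Add a numeric target column and fill it from the text attribute.
conn.execute("ALTER TABLE data ADD COLUMN target INTEGER")
conn.execute(
    "UPDATE data SET target = CASE WHEN Class = 'success' THEN 1 ELSE 0 END"
)

# Python drives the SQL, so results can be post-processed directly.
targets = [row[0] for row in conn.execute("SELECT target FROM data ORDER BY id")]
print(targets)
conn.close()
```

The same pattern of issuing SQL statements from a host language is what makes it possible to loop over or parameterize queries.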

Creating Workflow

Next, create a new Workflow.

Building predictive models in AdvancedMiner - 9

Building predictive models in AdvancedMiner - 10

Use drag and drop to place your database in the middle of the Workflow. Repeat the process with the Split Table. To connect them, drag the anchor from your database into the Split Table anchor.

Although Workflow is easy to use, it also supports more complex operations such as creating diagrams, technical and analytical transformations, modeling, generating code, or adding your own code written in Gython Script.

Building predictive models in AdvancedMiner - 11

Building predictive models in AdvancedMiner - 12

To select attributes, use the Attribute Usage tile.

Building predictive models in AdvancedMiner - 13

Edit the tile and select which attribute is the target.

Building predictive models in AdvancedMiner - 14

Set unnecessary attributes as inactive.

Building predictive models in AdvancedMiner - 15

Press the Execute button or F6.

Building predictive models in AdvancedMiner - 16

Application of the tree method

Use the tree method to create a model. Connect it to Attribute Usage and the database. Set the positive value of the target to “1” (or “0”, depending on what you want to predict) in the algorithm settings.

In addition to the tree method, there are also methods such as linear and logistic regression. Each method contains a number of unique settings, which can be freely customized.

Building predictive models in AdvancedMiner - 17

Building predictive models in AdvancedMiner - 18

To see how effective your prediction is, select Classification Test Result and connect it to the test database and model. Edit and select the target and its positive value.

Building predictive models in AdvancedMiner - 19

Building predictive models in AdvancedMiner - 20

The Classification Test Result allows you to analyze the results. You can inspect the ROC curve and calculate the area under it, view the Lift curve, or interpret the confusion matrix.

Building predictive models in AdvancedMiner - 21
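For readers who want to see what these evaluation measures compute, here is a small sketch in plain Python. It is independent of AdvancedMiner; the labels and scores are made-up values, and the function names are chosen for illustration. The area under the ROC curve is computed through its rank interpretation: the probability that a randomly chosen positive item receives a higher score than a randomly chosen negative one.

```python
def confusion_matrix(y_true, y_score, threshold=0.5):
    """Return (tp, fp, fn, tn) for binary labels and predicted probabilities."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test labels and model scores.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]

print(confusion_matrix(y_true, y_score))
print(round(roc_auc(y_true, y_score), 3))
```

The pairwise formulation is quadratic in the number of items, which is fine for a small illustration but not how a production tool would compute it.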

To make predictions with your model, attach the database you want to score to the Workflow. Next, attach the Scoring tile, which must be connected to the model.

Building predictive models in AdvancedMiner - 22

Select ‘View Mining Apply Task’, then add the item in ‘New Columns’.

Building predictive models in AdvancedMiner - 23

Building predictive models in AdvancedMiner - 24

Follow the steps below:

Building predictive models in AdvancedMiner - 25

Building predictive models in AdvancedMiner - 26

Building predictive models in AdvancedMiner - 27

Building predictive models in AdvancedMiner - 28

Press the Execute button or F6.

Building predictive models in AdvancedMiner - 29

A column with the probability that the value “1” will appear was added.

Building predictive models in AdvancedMiner - 30

Select the minimum probability from which the value will be positive.

Building predictive models in AdvancedMiner - 31

And check for which items there is a “1” and for which there is a “0”.

Building predictive models in AdvancedMiner - 32
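The last two steps amount to turning a probability column into a 0/1 label by comparing it with a chosen cut-off. A minimal sketch, assuming that a probability equal to the cut-off counts as positive (the tool's exact rule is not stated in the tutorial) and using made-up scores:

```python
def apply_threshold(probabilities, min_probability):
    """Label an item 1 when its probability reaches the chosen minimum, else 0."""
    return [1 if p >= min_probability else 0 for p in probabilities]

scores = [0.91, 0.35, 0.50, 0.08]   # hypothetical model outputs
labels = apply_threshold(scores, min_probability=0.5)
print(labels)
```

Raising the minimum probability yields fewer positive labels, and lowering it yields more.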

The article Tutorial: Building predictive models in AdvancedMiner comes from Algolytics.


August 10, 2018

Revolution Analytics

Because it's Friday: A data center under the sea

Microsoft's Project Natick is investigating the feasibility of housing data centers (like those that power Azure) in submerged pressure vessels, with the goal of putting computational services near...


Revolution Analytics

Redmonk Language Rankings, June 2018

The latest Redmonk Language Rankings are out. These rankings are published twice a year, and the top three positions in the June 2018 rankings remain unchanged from last time: JavaScript at #1, Java...


August 09, 2018


August 08, 2018

Revolution Analytics

In case you missed it: July 2018 roundup

In case you missed them, here are some articles from July of particular interest to R users. A program to validate quality and security for R packages: the Linux Foundation's CII Best Practices Badge...

Ronald van Loon

Telecom: Customer Experience is The Key to Success

It can be said without a shadow of a doubt that the digital transformation is now upon us. Gone are the days when companies wanted to prepare for it, and go into the era with sufficient preparation. With the transformation now upon us, it is time for action, and most companies feel the need to be at the top of their game more than ever.

The fact that digital transformation is here is especially true for operators offering services in the telecommunication space across the globe. These operators are in for a challenging time, and if they want to come out as victors, rather than victims of the transformation, they will have to make major adjustments to how they interact with their clients and customers. Customer engagement and meeting customer expectations sit at the center of all digital transformations, and since telecom operators directly cater to their customers, they have to revolutionize their services and be a part of this move towards the future. The complications operators are going through can be gauged by the fact that their business is shrinking, voice and text messages are no longer as common as they used to be, and customers have started moving towards newer forms of communication. Operators can either treat digitization as a chance to become involved in the move towards the future, or they can waste all the years of hard work and fall prey to digitization.

Using digitization is, of course, easier said than done, which is why operators would have to base their efforts on one defining factor here: customer experience. Customer expectations are changing, and by realizing the importance of these changing customer expectations and then giving customers what they want out of the digitization process, operators can work towards treating this complication as an opportunity.

The Need for Customer Experience Management

A customer with demands presents an opportunity for all telecom operators. To put it clearly, in this world of digital transformations, the customer of today is aware of what they want and how they want it. Thus, it is important for operators to recognize the specific and custom needs of all users, and cater to these needs for enhanced results.

Naturally, the first step in this process is to identify what each customer wants. Obviously, not all customer needs are the same. While one customer might be looking for 10 GB of data per month with some talk time, another will be looking for 20 GB per month. The roadmap here would be to first identify the needs of all customers and then work out how they want these needs to be met. Once operators know of these changing customer demands, they can convert their customers’ changing intent into a purchase that is beneficial for them. This is different from what operators had been doing before. Instead of making a one-size-fits-all package, operators now have to offer tailor-made services that match the changing expectations and the ever-increasing demands that customers place on them. In simpler words, operators now have to understand what every customer wants, and tailor their services according to these demands.

This is where customer experience management (CXM) comes into the picture. Having talked about the need to find out what every customer expects and the experience customers want, we will now talk about the importance of customer experience management and how it can help operators tailor their services and customize them according to customer preferences. Giving an exemplary service is now not just related to customer service, but also involves many other facets. The customer experience in today’s digital world extends to all of the user’s touch points. This would include all of the following:

  • When a customer interacts with a representative of the service provider while making or receiving calls. This conversation is now an important touch point, and operators need to ensure that they meet all customer expectations here. There is little room for error, and every customer needs to be satisfied on all counts.
  • When a customer places a complaint regarding the quality of the service or any other issue. This is again a make or break scenario, and the service provider should be wary of the expectations regarding the desired customer experience here. The customer expects nothing short of the best experience, and operators should ensure that. A disgruntled customer needs to be given special care if possible.
  • The exchange of words while a new product or service is being purchased by a customer should also be considered as a touch point. Operators could present information regarding new products and services here and can use this touch point to assure the new customer of their services.

Until very recently, telecom operators have been confusing experience with quality. Due to this confusion, they have been limiting the opportunities they have for providing the right experience to their customers. The quality of the network is not the only factor dictating experience, and operators need to ensure many other things as well, alongside maintaining the quality of the services and network that they are providing. Customers have been found to be receptive towards a well-rounded, better customer experience: a poll by Defaqto Research found that 55 percent of consumers were willing to pay more for an operator that provided a better customer experience. Moreover, further research in this regard has found that companies that excel at providing a fulfilling customer experience tend to grow 4-8% faster than the market.

With little doubt over the fact that customer expectations are rising, customer expectation management should be a top priority for all telecom operators. Telecom operators have to ensure a seamless and unique service for customers and in that context, a customer experience management program is the need of the hour.

Role of Big Data

With all the talk about meeting the experience that customers want and fulfilling their expectations, the question about how to extract these customer expectations comes into the picture. To that end, operators need to invest in big data and have analysis tools in place to find actionable insight. They would have to monitor the data usage of different customers and offer plans exclusively catered to their specific needs. Telecom operators would have to use analysis tools such as Machine Learning to extract sense out of the data they gather.


Going into the future, a flurry of activity is expected in the global CXM space. Not only telecom operators, but managers from different industries will start focusing on marketing and customer-centered activities that fulfill the expectations customers have of their service. Customer experience management is something that we can expect to roll out in the sales and distribution space as well in the near future. The digital transformation is going to become even more widespread, and CXM will play an even bigger role in the future.

Mahindra Comviva specializes in Customer Experience Management for telecom service providers. You can read the original post here to gather more information about the digital transformation, and the changing role of telecom operators.



The post Telecom: Customer Experience is The Key to Success appeared first on Ronald van Loons.


SoftBase Announces a Free Download of the TestBase SQLCODE -805 Tool

SoftBase announces a free download of the TestBase SQLCODE -805 Tool which helps identify the source of Db2 SQLCODE -805 on z/OS.  Please see to...

Big Data University

IBM Partners with edX to launch Professional Certificate Programs

IBM has partnered with edX, the leading online learning destination founded by Harvard and MIT, for the delivery of several Professional Certificate programs. Professional Certificate programs are a series of in-demand courses designed to build or advance critical skills for a specific career.

“We are honored to welcome IBM as an edX partner,” said Anant Agarwal, edX CEO and MIT Professor. “IBM is defined by its commitment to constant innovation and its culture of lifelong learning, and edX is delighted to be working together to further this shared commitment. We are pleased to offer these Professional Certificate programs in Deep Learning and Chatbots to help our learners gain the knowledge needed to advance in these incredibly in-demand fields. Professional Certificate programs, like these two new offerings on edX, deliver career-relevant education in a flexible, affordable way, by focusing on the skills industry leaders and successful professionals are seeking today.”

edX is a great partner for us too, not just because they have an audience of over 17 million students, but because their mission of increasing access to high-quality education for everyone so closely aligns with our own.

“Today we’re seeing a transformational shift in society. Driven by innovations like AI, cloud computing, blockchain and data analytics, industries from cybersecurity to healthcare to agriculture are being revolutionized. These innovations are creating new jobs but also changing existing ones, and require new skills that our workforce must be equipped with. We are therefore taking our responsibility by partnering with edX to make verified certificate programs available through their platform that will enable society to embrace and develop the skills most in-demand,” said IBM Chief Learning Officer Gordon Fuller.

The IBM Skills Network (of which Cognitive Class is a part) also relies on Open edX, the open source platform that powers edX, and we plan to contribute back our enhancements as well as support the development of this MOOC project. To learn more about how we use (and scale) Open edX, check out our [recent post] on the topic.

We are kicking off this collaboration with two Professional Certificate programs that might be of interest to you.

Deep Learning (the first course in the program, Deep Learning Fundamentals with Keras, is open for enrollment today and starts September 16)
Building Chatbots Powered by AI (the first course in the program, How to Build Chatbots and Make Money, is open for enrollment today and already running)

The chatbot program includes three courses:

1. How to Build Chatbots and Make Money;
2. Smarter Chatbots with Node-RED and Watson AI;
3. Programming Chatbots with Watson Services.

Those of you who are familiar with my chatbot course on Cognitive Class will recognize the first course on the list. The key difference is that this version on edX includes a module on making money from chatbots.

Node-RED is a really cool visual programming environment based on JavaScript. With only basic programming skills and the help of this second course, you’ll be able to increase your chatbot’s capabilities and make it interact with other services and tools, including sentiment analysis, speech to text, social media services, and deployment on Facebook Messenger.

The last course in this chatbot program focuses on other Watson services, specifically the powerful combination of Watson Assistant and Watson Discovery to create smarter chatbots that can draw answers from your existing knowledge base.

All in all, this program is still accessible to people with limited programming skills, though you will get the most out of it if you are a programmer.

The Deep Learning program is aimed at professionals and students interested in machine learning and data science. Once completed, it will include five courses:

1. Deep Learning Fundamentals with Keras;
2. Deep Learning with Python and PyTorch;
3. Deep Learning with Tensorflow;
4. Using GPUs to Scale and Speed-up Deep Learning;
5. Applied Deep Learning Capstone Project.

The goal of these programs is to get you ready to use exciting new technologies in the emerging fields of Data Science, Machine Learning, AI, and more. The skills you’ll acquire through these highly practical programs will help you advance your career, whether at your current job or when seeking new employment.

It’s a competitive market out there, and we are confident that these programs will serve you well. If you are an employer looking to re-skill your workforce, these programs are also an ideal way to do so in a structured manner.

The certificates also look quite good on a resume (or LinkedIn) as passing these courses and completing the programs demonstrates a substantial understanding of the topics at hand. This isn’t just theory. You can’t complete these Professional Certificate programs without getting your hands dirty, so to speak.

We also plan to launch more Professional Certificates in collaboration with edX, but if you have an interest in advancing your career in Data Science and AI, we recommend that you start with these two.

The post IBM Partners with edX to launch Professional Certificate Programs appeared first on Cognitive Class.


August 07, 2018

Revolution Analytics

IEEE Language Rankings 2018

Python retains its top spot in the fifth annual IEEE Spectrum top programming language rankings, and also gains a designation as an "embedded language". Data science language R remains the only...


August 03, 2018

Revolution Analytics

Because it's Friday: Undangerous Australia?

This comedy sketch from the New York Times imagines a possible new angle for a commercial from the Australian Tourist Commission. (NSFW language in the punchline at the very end.) I've had a...


Revolution Analytics

Video: How to run R and Python in SQL Server from a Jupyter notebook

Did you know that you can run R and Python code within a SQL Server instance? Not only does this give you a powerful server to process your data science calculations, but it makes things faster by...


August 01, 2018

Revolution Analytics

R Generation: 25 Years of R

The August 2018 issue of Significance Magazine includes a retrospective feature on the R language. (I suggest reading the PDF version, also available for free access.) The article by Nick Thieme...

Cloud Avenue Hadoop Tips

Upgrading from Ubuntu 16.04 (Xenial Xerus) to 18.04 (Bionic Beaver)

I have been using Ubuntu for quite a few years, and lately I had been running Ubuntu 16.04 alongside Windows 10 as a dual boot on my Lenovo Z510. I use Ubuntu for pretty much everything and Windows for any software which is not compatible with Ubuntu. This has been a deadly combination which worked for me pretty well.

Why the upgrade in Ubuntu?

In Ubuntu 16.04 pretty much everything was working well, except suspend and hibernate. The system was not able to resume from suspend every time. The only option left was to shut down and restart the computer along with all the applications, which is not really nice.

Checking the different Ubuntu forums and trying out the suggested solutions didn't fix the problem. So I finally decided to upgrade Ubuntu to the latest version. There was a chance that the upgrade process would go wrong and the data would be lost, but my data is backed up automatically to different clouds, so this was not an issue.

Ubuntu 18.04 was released in April, a few months back, but upgrading directly from 16.04 (Xenial Xerus) to 18.04 (Bionic Beaver) is not recommended. Upgrading to the point release 18.04.1 is the safest approach, as it gives Canonical time to fix the bugs and make the transition smoother.

So, as soon as 18.04.1 was announced, I took a shot and upgraded to Ubuntu 18.04.1 by following the instructions mentioned here.

How was the Ubuntu upgrade?

In the early days of Ubuntu, upgrading from one version to another often messed up the operating system, but it was really smooth this time. Here I am with the latest Ubuntu after a reboot.

Ubuntu 18.04 (Bionic Beaver) Desktop

The download and installation process took about 2 hours, with a good number of prompts in between. I wish there were a 'Yes to all' option, which would have let the installation run unattended.

Was everything smooth after the upgrade?

Usually any software upgrade has some major or minor issues which get fixed over time, and the same is the case with Ubuntu. Here is a list of some issues to start with. I will update the list, along with possible solutions if any, the more I use the latest Ubuntu.

  • Ubuntu was using the Unity UI and has moved to GNOME, so it takes some time to get used to the new UI. But my initial impressions of GNOME are good.

  • I had been using Phatch to batch mark the images on this blog, but it has been removed from the Ubuntu repository. A quick search turned up Converseen as an alternative, which I have yet to try.

  • Right click on the mouse stopped working and has been replaced with a two-finger click. There were a couple of suggested solutions, and a quick try of some of them didn't work. Again, it will take some time to get used to the two-finger click.

  • The good thing is that suspend started working and I was able to resume where I stopped. This has increased my productivity and focus. When I used the Nvidia display driver instead of the default open source Nouveau display driver, the suspend functionality broke and I had to revert to the Nouveau display driver.

Should I upgrade?

If Canonical will support the Ubuntu version you are using for the next few years and there is no pressing issue, like suspend in my case, then I would recommend sticking with the current OS. But if you like to try the latest technology, as I do, then go ahead with the upgrade.

    July 30, 2018

    Revolution Analytics

    A Certification for R Package Quality

    There are more than 12,000 packages for R available on CRAN, and many others available on Github and elsewhere. But how can you be sure that a given R package follows best development practices for...


    Simplified Analytics

    Digitization in Fitness Industry

    Remember as children we used to play more physical games & ground activities like football, cricket, and even some local games. As we grew we were more physical in our daily activities which had...

    Making Data Meaningful

    What is BIG Data?

    Often in the business world, things come along that drive change. Sometimes the change is subtle. Sometimes the change is dramatic. In a lot of cases, there are early adopters, as well as those that one can count on to wait until the change is deemed mature by the industry experts. It is also important to put forth a value proposition for adopting the change; change for change’s sake generally doesn’t help the bottom line. In some cases, the change crosses industries and technology, but sometimes the change happens among clusters of businesses and gives one a competitive advantage over another.

    Consider some of the transformations that have happened over the past few decades:

            • The computer
            • The internet
            • Mobile technology
            • Laws (SOX, etc.)
            • Hybrid vehicles
            • Global economics
            • Social networks

    Companies that were able to see these changes coming were able to prepare. Companies that didn’t see them coming were forced to adapt. Being prepared is arguably the best alternative, but in many cases and for many reasons, this may not be financially, socially, or logistically possible during the early adoption period. Eventually, more and more companies adopt the change into their culture or technology as it becomes more pervasive and valuable. Those that take the lead and the initiative generally experience growing pains, but also gain the advantage of working out the kinks of the emerging technology or change. Those that wait may face consequences for the delay, but they postpone the disruption to critical processes for as long as possible.

    So what would we say are some changes that are on the way if not already here that will help shape the next decade of business activity? One would argue that “Big Data” is a great candidate for the next major change we are beginning to feel the effects of as its existence begins to transform the business landscape.

    So what is “Big Data”? According to Gartner[1] Big Data is defined as:

    Big data is the term adopted by the market to describe extreme information management and processing issues which exceed the capability of traditional information technology along one or multiple dimensions to support the use of the information assets. Throughout 2010 and into 2011, big data has focused primarily on the volume issues of extremely large datasets generated from technology practices such as social media, operational technology, Internet logging and streaming sources. A wide array of hardware and software solutions has emerged to address the partial issue of volume. At this point, big data, or extreme information processing and management, is essentially a practice that presents new business opportunities.

    The definition leaves the reader with a lot of questions. What is “extreme information management”? What are the “hardware and software solutions” that attempt to address the issues? What are the “new business opportunities”? Let’s take a look at some of these questions in more detail.

    Extreme Information Management

    When we look at the sheer volume or magnitude of some of the datasets that would make up a Big Data solution, it is clear the traditional processes of data management will be challenged. The normal ways of loading, storing, processing, evaluating, and analyzing this information have to change in order to reap the benefits hidden in the datasets. This means the normal way a technologist would write a query needs to change. The usual backup and recovery process has to be reconsidered. In addition, how this “extreme” information is combined with the “traditional” information is the new conundrum facing businesses that are going down this path. In a practical sense, a company that has a traditional data warehouse in place will be faced with the challenge of aligning existing dimensions with the new “dimensions” from the big data arena. If this challenge can be solved for an enterprise, then the competitive advantage discussed earlier can be realized, because the enterprise can leverage both kinds of information together. Even a company that only begins evaluating the contents of big data as it relates to its enterprise will gain valuable insights and correlations that elude the competitors that wait out the change until it is mature.

    Hardware and Software Solutions

    When a company is considering a big data solution, the existing infrastructure stack will be impacted immediately. A lot of vendors are stepping into the technology gap to help customers move forward with initiatives. In a lot of cases, storage and processing power will become immediate needs. Deep in the internals of big data solutions is the idea of multiple worker servers that take on distributed work, breaking the processing down into smaller chunks. This can be achieved with potentially low cost servers, but it may take a lot of them, depending on the scale desired. The open source community has been a leader in this arena, with solutions that help address the big data problem at a low cost of entry. For the technologist, this means learning new ways of addressing data access using environments like Hadoop or HPCC. The traditional ETL tool vendors like Pentaho are also attempting to get in the game by extending their products to work in a big data environment. In the long run, we expect the vendors to all address “Big Data” in some form or fashion, suggesting the change is here to stay. Therefore, it would be wise to do some research into the current vendor offerings or begin to experiment with some of the solution options that are available, in order to begin to reap some of the benefits of a big data solution.
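The worker idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads on one machine stand in for the worker servers, and the function names (`split_into_chunks`, `count_words`, `distributed_word_count`) and the sample log lines are invented for the example. It is not how Hadoop or HPCC are implemented.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Break the full dataset into smaller pieces of work."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_words(chunk):
    """The 'map' step: each worker counts words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(records, chunk_size=2, workers=4):
    """Fan the chunks out to workers, then merge ('reduce') the partial results."""
    chunks = split_into_chunks(records, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical input: a handful of log lines standing in for a large dataset.
log_lines = [
    "error disk full",
    "warning disk slow",
    "error network down",
    "info disk ok",
    "error disk full",
]
totals = distributed_word_count(log_lines)
print(totals["disk"], totals["error"])
```

Because each chunk is processed independently, adding more workers (or, in a real cluster, more low cost servers) lets the same job scale to larger inputs; only the final merge sees all the partial results.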

    New Business Opportunities

    When considering what new business opportunities will be presented as a result of engaging in a Big Data environment, the potential is alluring. When dealing with datasets in the terabyte and petabyte range, the scope and makeup of the results go from specifics to generalities in rapid order. It is currently feasible to look at things from a dimensional-model-like perspective. It is difficult, and in some cases no longer possible, to calculate results as you would normally do in traditional database environments, because the sheer physics have changed. But there are opportunities for businesses to integrate this new landscape into their current Business Intelligence environments. Some companies are finding ways to use sources ranging from public web sites to private applications to drive value and competitive advantage. The government is even seeing the opportunity and critical need to get involved[2]. With this much activity related to a single discipline, we would recommend actively seeking out the opportunities in your own business environment. More than likely, your competitors already are.

    What are the challenges?

    Some of the challenges will be related to identifying sources, because sometimes these are external to a business and require creative thought to see the big picture and the analytical opportunities. Additional challenges will be related to bringing a technical staff up to speed on new technology and new ways of thinking related to size and scale of Big Data environments. Furthermore, there will be changes to “normal” data acquisition and storage policies. In addition to all this, there still is the integration issue. This issue can be more complex for companies that already have a mature data warehouse environment, and wish to navigate from a specific dimensional model to a less specific Big Data analysis. The bridge between the two can be daunting, physically and philosophically.

    What does this mean for you?

    Although Big Data is still an emerging technology and concept, there is enough momentum in the marketplace to warrant serious consideration by customers and clients. There are still significant challenges in bringing it all together in a comprehensive Business Intelligence environment. It is inevitable that systems will collect more and more data points going forward, and systems that can evaluate the large data sets will become more and more ingrained in the normal technical environment. The integration between traditional systems and big data systems will become more seamless over time. The more companies can leverage their information investment across all the data points as they become available or affect their business, the more effective they will become.

    If you would like to learn more about Big Data and what impact it may have in your environment or are just curious about how to approach the challenges in this new arena, we would love to help work through it with you.

    For more information on BIG DATA or to see how we can help your business with its BIG DATA needs, contact us.

    The post What is BIG Data? appeared first on Making Data Meaningful.

    Ronald van Loon

    The Future of Business for an Intelligent World

    The intelligent world is truly upon us, and based on the technological advancements that we are becoming used to, one cannot wait for the future to arrive. In this intelligent world, there is pressure on businesses to deliver an exceptional customer experience, and to use every technology they can to ensure that the customers get what they expect.

    I recently had the opportunity to attend the SAP Ariba Live event in Amsterdam. The event is one of the biggest supply chain and procurement conferences held across the globe. At the conference, I had the opportunity to hear and witness excellent use cases, and speak to the president of SAP Ariba, Barry Padgett.

    We are moving towards a change in the way businesses develop their ecosystem. Innovation is now an important part of the business ecosystem, and many organizations are doing all they can to ensure that they develop interesting and futuristic business processes. The procurement of material as you work is now being made possible, and the downtime involved in the process will diminish. Procurement and supply chain are affecting our business standards; organizations now need to jump on the bandwagon and develop a data-driven environment, where they use all the latest tools of data analysis to reach actionable insights.

    There are numerous use cases and examples of how businesses are moving towards a better world for our future generations. Gone are the days when businesses would place selfish motives above anything else, and focus on them at all times. Now, businesses are concerned about resource management and achieving the most efficient and ethical standards within their operations.

    Some examples of this change in practice include:


    Shell, which is one of the biggest energy sector companies in the world, is looking at the market of the future. The company has not only made plans to provide low-carbon energy as the ecosystem across the world changes, but has also set the example of assisting leaders of the future. The world is progressing at a rapid pace, and it is the responsibility of individuals and entrepreneurs across the globe to ensure that they give nothing less than their best.

    Shell has shown interest in helping build a cleaner environment across the world, and using its influence to foster healthy growth across the world. As part of their operation, they research technical innovations to solve the energy challenge and find ways to produce greener energy. Digital technology is the tipping point for Shell here, as it helps their ambitions and goals to come together, creating a bright future for the generations to come.

    Amazon Alexa

    There are no doubts about the use of Amazon Alexa in crafting the world of the future. The AI- and IoT-powered device has created a revolutionary impact, and has led us to believe that the future is here. However, the use of Alexa in procurement and supply chain is something new. At the conference I witnessed amazing use cases of how Alexa can make supply chain operation and procurement easier for everyone involved.

    With Alexa’s unique attributes, supply chain managers can now use it to make orders and to secure ETAs on their supplies. Alexa gives managers and other stakeholders the option to give the device commands regarding any business needs. This is done through voice recognition, as Alexa recognizes the voice of the person who has the authority to place orders. Once an order is placed, Alexa processes it and gives an estimate of its cost, based on the quantity ordered. Finally, Alexa gives a detailed estimate of the time it will take for the order to be delivered. This device uses some of the most complicated and interesting concepts of AI to deliver data analysis.

    Imagine a production worker working on a production site. The worker sees the need to order more head safety gear, but since they are on site, it is difficult to process the order at that time— they would have to head over to the nearest computer, take off their gear and go through the catalogue. Alexa automates all of this and gives people a golden opportunity to get whatever they want, wherever they want it. Taking the example above, the production worker would now only need to use their voice to contact Alexa, order more head safety gear and get an estimate of its delivery time.

    Automating the Procurement Process

    There has been a need for automation in the procurement process for a long time. This automation has come through the IoT sensors that were presented at the SAP Ariba event. They collect data from multiple sources; then, not only do they alert you to any anomaly, but they instantly help you get an estimate of the damage, as well as the cost of spare parts and repair to get the machinery up and running again.

    As an example, take the use case of windmills in Denmark. They serve as a source of energy production, and have sensors attached to measure their performance in real time. There is no need for engineers to go to the field and monitor the windmills every day. The sensors also alert the management to possible breakdowns, the reasons behind the breakdown and what should be done to get the windmill up and running. So, if the reason behind the breakdown is poor resistance to the forces of weather, the management can improve efficiency down the line by incorporating more efficient methods.

    President of SAP Ariba’s View on the Future of Procurement

    The president of SAP Ariba, Barry Padgett, was extremely hopeful about the use of digital means to make procurement better. He mentioned the use of AI techniques, including ML and DL, and how organizations could make a complete ecosystem through the use of other technological advances such as the Blockchain. The Blockchain comes in handy in identifying the person you’re doing business with. It authenticates the process and then gives a go-ahead signal for the future.

    When asked about the trends that we can expect from the intelligent world in the future, Barry Padgett mentioned that he saw the growing use of digital technologies as a welcoming sign. He, however, said that the future wouldn’t be about these technologies themselves. As much as we talk about ML, AI, DL, the Blockchain and other technologies, he thinks the future is going to be about the advancements we make through these technologies. He mentioned, “I think we’ll hear less about all these technologies as we move on. We won’t have to talk about the technologies anymore, but will instead be talking about the cool things that organizations are doing with these technologies.”

    Click here to learn more about the world of the future.



    Ronald helps data-driven companies generate business value with best-of-breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining. He is a guest author on leading Big Data sites, a speaker, chairman and panel member at national and international webinars and events, and runs a successful series of webinars on Big Data and on Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a Data Consultancy company, a leader in Big Data and data process management solutions. He has a broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.

    The post The Future of Business for an Intelligent World appeared first on Ronald van Loons.

    Cloud Avenue Hadoop Tips

    Compatibility between the Big Data vendors

    What do the Big Data vendors have to offer?

    Now that the Big Data wars have pretty much ended, we have Cloudera, MapR and Hortonworks as the major Big Data vendors. There are also other pure-play vendors that focus on one or two pieces of Big Data software (like DataStax on Apache Cassandra), but Cloudera, MapR and Hortonworks provide a complete suite of software covering storage, processing, security, easy installation etc. These vendors solve problems such as:

    • Integrating the different software from Apache. Not every piece of Big Data software from Apache is compatible with the others. These vendors make sure that the different Apache projects play nicely with each other.

    • Installation and fine tuning of Big Data software is not easy. It is no longer a matter of download and click. These vendors make the installation process easier and automate as much of it as possible.

    • Although the software from Apache is free to use, the Apache Software Foundation doesn't provide any commercial support. Companies like Cloudera, MapR and Hortonworks fill the gap, as long as the software from these vendors is being used.

    Read more »

    Tutorial: How to publish script in R as a Web Service?

    Imagine that you have built a predictive model with R and would like to predict in real time whether it is profitable to grant a loan. You wonder how to publish the script written in R as a Web Service.

    At first glance this task seems complex; however, it turns out to be a piece of cake. With the help of predictive models it is possible to predict the occurrence of specific events on the basis of historical data. Predictive models have various applications, and in this example we will use a model that assesses the probability of loan repayment.

    How to deploy a model in R into Scoring.One?

    Scoring.One is a tool which enables quick implementation of predictive models, creating decision-making scenarios with extremely high efficiency (processing hundreds of thousands of queries for many predictive models at the same time).

    Sign up for free!

    In order to use an R model in Scoring.One you need a .zip file containing:

    1. .rds file with the model
    2. .csv file with the variables included in the model. You can create it on your own, or by using the “Start” button when creating a scenario and then exporting it to a .csv file (do not forget to save the edited block regularly!)
    3. .R file containing the variable with the result to be returned by the model; the variable must be named rResult

    You can download the ZIP file used in this tutorial here, and read Scoring.One documentation.

    The worst part’s over, and now is the time for sheer pleasure – deploying your model!

    Go to the “Scoring Code Management” tab and click on the “Upload new scoring code” button.

    Enter the name for your scoring code and drop the .zip file in the box or click to upload it.

    After saving the code it is time for a scenario. Choose the “Scenario” tab and create a new one. Drag and drop blocks to the center of the grey area. In this case (although probably like in all cases – the name says it all 🙂 ) we start with the “Start” block.

    You do not need to enter the variables, since they have been imported in the .zip file. Drag and drop the “Scoring Code” and “End” blocks and connect them in order.

    Click on the “Scoring Code” block and select the previously uploaded code, from which you also import the variables. Save and click “Deploy”. The scenario will then be processed, and in the “End” block you can select the variables to be returned as a result of the scenario. Then click the “Deploy” button once again.

    You can see that creating a scenario is like a walk in the park!

    Let’s test your scenario: choose “Forms”, find your scenario and enter the data. As a result you will obtain the scoring results saved in the rResult variable. Now the model has been queried.

    You can add more variables, for example, in order to categorise on the basis of results. To add more variables go back to the scenario and add one more block.

    In “Expressions” you can add code in R or Groovy. The variables created there will also be returned by the model.

    If you add the variable “category”, you will get not only the scoring results, but also the category to which its value belongs.

    In the place where the URL should be entered, insert:


    How to find this mysterious Score Token?

    It is hidden in User Settings 🙂 .


    As the body of the query, enter the variables needed by the model in JSON form.
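For readers who prefer to build the request body in code, here is a minimal Python sketch of what such a JSON body could look like. The variable names (`age`, `income`, `loan_amount`) are invented for illustration; the real names are whatever variables your own model's .csv file lists. The sketch only builds the body and does not send it anywhere.

```python
import json

# Hypothetical model inputs: replace these with the variables your model expects.
model_inputs = {
    "age": 35,
    "income": 4200.0,
    "loan_amount": 15000.0,
}

# Serialize the inputs to the JSON text that goes into the body of the query.
body = json.dumps(model_inputs)
print(body)
```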

    Press the “Send” button. And that’s all it takes. You have finished the task!

    The post Tutorial: How to publish script in R as a Web Service? appeared first on Algolytics.


    July 29, 2018

    Making Data Meaningful

    ETL Development

    ETL stands for Extract, Transform, Load. It is a business intelligence (BI) oriented process to load data from the source system to the target system to enable business reporting. It is used for migrating data from one database or platform to another, forming data marts and data warehouses and also converting databases from one format to another.

    The three ETL steps are:

    • (E) Extracts data from sources that are mostly heterogeneous or different types of systems such as relational databases, flat files, mainframe systems, xml, etc.;
    • (T) Transforms the data through cleansing, calculating, translating, filtering, aggregating, etc. into a structure that is more appropriate for reporting and analysis;
    • (L) Loads the data into the end target (database and/or cubes) in a presentation-ready format for end users to make decisions.
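The three steps above can be sketched end to end in a few lines of Python. This is a toy illustration under stated assumptions: the source is an in-memory CSV string standing in for a flat-file export, the target is an in-memory SQLite database, and the table and column names (`sales_by_region`, `order_id`, `region`, `amount`) are invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source: a flat-file export from a point-of-sale system.
SOURCE_CSV = """order_id,region,amount
1,north, 10.50
2,South,20.00
3,north,
4,south,5.25
"""


def extract(text):
    """(E) Read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """(T) Cleanse and aggregate: skip rows without an amount,
    normalise region names, and sum the amounts per region."""
    totals = {}
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # cleansing: skip incomplete records
        region = row["region"].strip().lower()
        totals[region] = totals.get(region, 0.0) + float(amount)
    return sorted(totals.items())


def load(conn, summary):
    """(L) Write the presentation-ready result into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region (region TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", summary)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
result = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
print(result)
```

Keeping each step as its own function mirrors the E, T and L split: the extract step knows about the source format, the transform step knows the business rules, and the load step knows the target schema.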

    Extract from source

    The ETL process starts with extracting data from different internal and external sources. Each data source has its distinct set of characteristics. For example, data from point-of-sale, inventory management, production control and general ledger systems are often logically and physically incompatible. In general, the goal of the extraction phase is to effectively extract source systems that have different DBMSs, operating systems, hardware, communications protocols and so on, to convert the disparate and/or heterogeneous source data into a single format appropriate for transformation processing.

    The ETL architect may decide to store data in a physical staging area with the same structure as the source rather than processing it in memory. The challenge of finding the optimal balance between writing data to staging tables and keeping it in memory during the ETL process comes from two conflicting objectives:

    • Load the data from the source to the target as fast as possible;
    • Be able to recover from failure without restarting from the beginning of the process.
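The second objective can be illustrated with a minimal sketch, assuming the extract is split into batches and that a checkpoint records how many batches have already been staged. The names `run_batches`, `flaky_stage` and the in-memory `checkpoint` dictionary are hypothetical; a real job would keep the checkpoint in durable storage and write to actual staging tables.

```python
def run_batches(batches, process, checkpoint):
    """Process batches in order, skipping those already recorded as done."""
    for index, batch in enumerate(batches):
        if index < checkpoint["next_batch"]:
            continue  # already staged in an earlier run
        process(batch)
        checkpoint["next_batch"] = index + 1  # persist this in a real job


staged = []                      # stands in for a staging table
checkpoint = {"next_batch": 0}   # stands in for durable checkpoint storage
batches = [[1, 2], [3, 4], [5, 6]]
attempts = {"count": 0}


def flaky_stage(batch):
    """Stage one batch; fail once on the second batch to simulate a crash."""
    attempts["count"] += 1
    if batch == [3, 4] and attempts["count"] == 2:
        raise RuntimeError("simulated failure")
    staged.extend(batch)


try:
    run_batches(batches, flaky_stage, checkpoint)
except RuntimeError:
    pass  # the job stops; the checkpoint still says the first batch is done

run_batches(batches, flaky_stage, checkpoint)  # the restart resumes at batch 1
print(staged)
print(attempts["count"])
```

Writing each batch to staging costs time on every run, but the restart only repeats the batch that failed instead of the whole extract, which is the trade-off described above.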

    Transform the data

    The extract step only moves and stages data. The goal of the transformation stage is to change the data to make it usable for the intended purposes.

    Once the data is available in the staging area or processed in memory, it is all on one platform and in one database. Transformations such as joining and unioning tables, filtering and sorting the data on specific attributes, pivoting to another structure, making business calculations, and creating aggregates or disaggregates can then be performed easily. In this step of the ETL process, data is cleaned to remove errors, business rules are applied, and data is checked for integrity and quality. If data validation fails, exceptions may be raised, so not all data is handed over to the next step. Once all the data is prepared, slowly changing dimensions may be implemented so that analyses and reports can track how attributes change over time. For example, this is useful when a customer moves from one region to another.
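As an illustration of the customer-moves-region example, here is a minimal sketch of one common way to implement a slowly changing dimension (often called "Type 2": expire the old row and add a new one). The post does not say which variant it has in mind, so the approach, the function names and the `valid_from` / `valid_to` columns are assumptions.

```python
from datetime import date


def apply_scd2(dimension, customer_id, new_region, change_date):
    """Close the current row for the customer and append a new version."""
    for row in dimension:
        if row["customer_id"] == customer_id and row["valid_to"] is None:
            if row["region"] == new_region:
                return dimension  # nothing changed, keep history as-is
            row["valid_to"] = change_date  # expire the old version
    dimension.append({
        "customer_id": customer_id,
        "region": new_region,
        "valid_from": change_date,
        "valid_to": None,  # None marks the current version
    })
    return dimension


def region_on(dimension, customer_id, day):
    """Look up the region that was valid for a customer on a given day."""
    for row in dimension:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= day
                and (row["valid_to"] is None or day < row["valid_to"])):
            return row["region"]
    return None


customer_dim = []
apply_scd2(customer_dim, 42, "North", date(2017, 1, 1))
apply_scd2(customer_dim, 42, "South", date(2018, 6, 1))  # customer moves

print(region_on(customer_dim, 42, date(2018, 3, 15)))
print(region_on(customer_dim, 42, date(2018, 7, 1)))
```

Because the old row is kept, a report for March 2018 still attributes the customer to the region that was valid at that time.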

    Load into the target

    Finally, data is loaded into the end target – usually the data warehouse – which contains fact and dimension tables. From there the data can be combined, aggregated, and loaded into data marts or cubes to support business reporting needs.

    Because the load phase interacts with a database, the constraints defined in the database schema, as well as in triggers activated upon data load, apply (for example, uniqueness, referential integrity, mandatory fields). These constraints also contribute to the overall data quality of the ETL process.
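The three steps can be sketched end to end with nothing but the Python standard library. This is only an illustration under stated assumptions: the CSV content, the `customer_sales` table and the function names are made up, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a flat file with inconsistent formatting.
SOURCE_CSV = """customer_id,name,region,amount
1, alice ,north,10.50
2,Bob,SOUTH,n/a
3,carol,north,4.25
1, alice ,north,2.00
"""


def extract(text):
    """Extract: read raw rows from the source without changing them."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cleanse values, reject invalid rows, aggregate per customer."""
    totals, rejected = {}, []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            rejected.append(row)  # validation failure: row is not passed on
            continue
        key = (int(row["customer_id"]), row["name"].strip().title(),
               row["region"].strip().lower())
        totals[key] = totals.get(key, 0.0) + amount
    clean = [(cid, name, region, total)
             for (cid, name, region), total in sorted(totals.items())]
    return clean, rejected


def load(conn, rows):
    """Load: write to the target; schema constraints act as a last check."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_sales ("
        "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "region TEXT NOT NULL, total REAL NOT NULL)"
    )
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany("INSERT INTO customer_sales VALUES (?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(SOURCE_CSV))
load(conn, clean_rows)
loaded = conn.execute(
    "SELECT customer_id, name, region, total FROM customer_sales "
    "ORDER BY customer_id"
).fetchall()
print(loaded)
print(len(rejected_rows), "row(s) rejected")
```

The `PRIMARY KEY` and `NOT NULL` constraints play the role described above: even if the transform step let a duplicate or incomplete row through, the load would fail rather than silently store it.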

    The figure below displays these ETL steps.

    ETL Development Steps

    When to use ETL?

    The key function of an ETL process is data integration, which enables business intelligence. Data integration allows companies to migrate, transform, and consolidate information quickly and efficiently between systems of all kinds. Pulling databases, application data and reference data into data warehouses provides businesses with visibility into their operations over time and enables management to make better decisions.

    For example, a financial institution might have information on a customer in several departments and each department might have that customer’s information listed in a different way. The membership department might list the customer by name, whereas the accounting department might list the customer by number. ETL can bundle all this data and consolidate it into a uniform presentation, such as for storing in a database or data warehouse.

    Another way that companies use ETL is to move information to another application permanently. For instance, the new application might use another database vendor and most likely a very different database schema. ETL can be used to transform the data into a format suitable for the new application to use.

    ETL tools

    Today, the ETL process is widely used in business intelligence with the help of ETL tools. Before ETL tools evolved, the ETL process was done manually with SQL code written by programmers. This task was tedious and cumbersome since it involved many resources, complex coding, and long work hours. On top of that, maintaining the code was a great challenge for the programmers.

    ETL tools reduce these difficulties: they are powerful and offer advantages in all stages of the ETL process. Extraction, data cleansing, data profiling, transformation, debugging, and loading into the data warehouse are all streamlined and visually documented compared to the old method. Most modern ETL software also covers real-time and on-demand data integration in a service-oriented architecture (SOA), as well as master data management.

    Widely used ETL tools include Informatica, DataStage, SQL Server Integration Services, Pentaho Data Integration, and Oracle Warehouse Builder.

    The post ETL Development appeared first on Making Data Meaningful.

    Knoyd Blog

    Time-series Analysis - Part 1

    In this two-part blog post, we will show you how to analyse time-series and how to forecast future values using the Box-Jenkins methodology. As a testing dataset, we have chosen the "Monthly production of Gas in Australia". This dataset is available from datamarket for free. We have restricted the data to the period from 1970 to 1995.

    We are splitting the topic like this:

    • PART 1 - Explanatory Data Analysis
    • PART 2 - Forecasting by Box-Jenkins Methodology 


    1. Explanatory data analysis 

    In the first installment, we will begin with explanatory data analysis. This part is useful for everyone who wants to visualise time-series. First we have to load our time-series into a pandas data frame and create two series. The first is our training series (1970 to 1990), on which we train our model, and the second is the test series for evaluating the model (1991 to 1995).


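    import pandas as pd

    # Load the monthly series; the first CSV column (dates) becomes the index.
    df = pd.read_csv('gas_in_aus.csv',
                      header = 0,
                      index_col = 0,
                      parse_dates = True,
                      sep = ',',
                      names = ["Gas"])
    y = df['Gas']
    # Training period: 1970 to 1990; evaluation period: 1991 onwards.
    y_train = y["1970" : "1990"]
    y_test = y["1991" :]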



    After successfully loading our data, it is reasonable to plot the y_train series to get to know what our data looks like:




    From the plot above we can clearly see that the time-series has strong seasonal and trend components. To estimate the trend component we can compute a rolling mean, which pandas provides through the rolling method (older pandas versions offered a separate rolling_mean function), and plot the results. If we want to make the plot fancier and reusable for other time-series, it is a good idea to wrap it in a function. We can call this function plot_moving_average.


    import matplotlib.pyplot as plt

    def plot_moving_average(y, window=12):
        # calculate the moving average over `window` observations
        # (pd.rolling_mean was removed from pandas; Series.rolling replaces it)
        moving_mean = y.rolling(window = window).mean()
        # plot the original series and its moving average
        plt.plot(y, label='Original', color = '#00cedd')
        plt.plot(moving_mean, color = '#cb4b4b',
                 label = 'Moving average mean')
        plt.legend(loc = 'best')
        plt.title('Moving average')
        plt.xlabel("Date", fontsize = 20)
        plt.ylabel("Million MJ", fontsize = 20)
        plt.show(block = False)


    To use this function we have to check our time-series and try to estimate the length of its period. In our case, it is 12 months. If the length of the period cannot be determined from the plot, you have to use a discrete Fourier transform and a periodogram to find the periodic components. However, that is beyond the scope of this post. After applying our function to the time-series we get a plot with an estimate of the trend:
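As an aside on the period estimation mentioned above, here is a minimal sketch of a periodogram-based estimate. It is not part of the original analysis: the function name `dominant_period` is hypothetical, the series is synthetic rather than the gas data, and a plain-Python discrete Fourier transform is used so the example has no dependencies.

```python
import cmath
import math


def dominant_period(values):
    """Return the period (in samples) with the largest periodogram power."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the constant level

    best_k, best_power = None, -1.0
    # Frequencies k = 1 .. n // 2 cycles per series (k = 0 is the mean).
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2 / n  # periodogram ordinate at frequency k / n
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic monthly series: 20 years with a 12-month cycle (not the gas data).
series = [100 + 10 * math.sin(2 * math.pi * t / 12) for t in range(240)]
print(dominant_period(series))
```

On a real series with a strong trend, the lowest frequencies tend to dominate the periodogram, so the trend would normally be removed first (for example by subtracting a moving average) before looking for the peak.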




    Let's say we want to know in which months production is highest. This is not clear from the original time-series, so we will look at our data from another angle. We want to see our time-series as a plot which shows the month on the x-axis and production on the y-axis. Each line in this plot represents one year:


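    # df_train is not defined in the post; assumed to be the training series as a data frame
    df_train = y_train.to_frame()
    df_train['Month'] = df_train.index.strftime('%b')
    df_train['Year'] = df_train.index.year
    month_names = pd.date_range(start = '1975-01-01', periods = 12, freq = 'MS').strftime('%b')
    # one row per month, one column per year
    df_piv_line = df_train.pivot(index = 'Month',
                                 columns = 'Year',
                                 values = 'Gas')
    df_piv_line = df_piv_line.reindex(index = month_names)
    df_piv_line.plot(colormap = 'jet')
    plt.title('Seasonal Effect per Month', fontsize = 24)
    plt.ylabel('Million MJ')
    # legend call reconstructed; only bbox_to_anchor survived, loc is an assumption
    plt.legend(loc = 'center left',
               bbox_to_anchor=(1.0, 0.5))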


    By running the code above we get:




    From this plot, it is clear that production is highest in winter (as expected). It is also easy to compare years with each other and see how gas production increases from year to year. Another way to look at the same data is to use box-plots:
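The numbers behind such box-plots can be sketched with the standard library alone. This is an illustration, not the code used for the original figure: the values are made up, `monthly_box_stats` is a hypothetical helper, and it reports the plain minimum and maximum, whereas plotting libraries usually draw whiskers at 1.5 times the interquartile range.

```python
import statistics
from collections import defaultdict


def monthly_box_stats(observations):
    """Group (month, value) pairs by month and return box-plot statistics."""
    by_month = defaultdict(list)
    for month, value in observations:
        by_month[month].append(value)

    stats = {}
    for month, values in by_month.items():
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        stats[month] = {"min": min(values), "q1": q1, "median": median,
                        "q3": q3, "max": max(values)}
    return stats


# Made-up values for two months over five years (not the gas dataset).
sample = [("Jan", v) for v in (10, 12, 14, 16, 18)] + \
         [("Jul", v) for v in (30, 34, 38, 42, 46)]
stats = monthly_box_stats(sample)
print(stats["Jan"])
print(stats["Jul"]["median"])
```

Each month gets one box, so a month whose box sits higher than the others has systematically higher production across all years.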




    Conclusion - PART 1

    As you can see from the plots of this blog, time-series visualisations do not have to be just one plot with dates on the x-axis and values on the y-axis. There are many other ways to visualise time-series and get interesting hidden features from them. If you are interested in getting as much as possible from your time-series data, do not hesitate to contact us. 


    Stay tuned: in the next part, we will look into forecasting future values of a time-series with the Box-Jenkins methodology.

    Get in touch


    Simplified Analytics

    How HR Analytics play in Digital Age

    Today every company is acting on the digital transformation or at least talking about digital transformation. While it is important to drive it by analyzing customer behavior, it is extremely...


    July 27, 2018

    Revolution Analytics

    Because it's Friday: Street Orientation

    Most cities in the US have a grid-based street structure. But it's rarely a perfect grid: sometimes the vagaries of history, geography, or convenience lead to deviations from right angles. And...


    Revolution Analytics

    aRt with code

    Looking for something original to decorate your wall? Art With Code, created by Harvard University bioinformatician Jean Fan, provides a collection of R scripts to generate artistic images in the...

    InData Labs

    Deep Learning: Strengths and Challenges

    Deep learning is largely responsible for today’s growth in the use of AI. The technology has given computers extraordinary powers, such as the ability to recognize speech almost as well as a human being, a skill too tricky to code by hand. Deep learning has also transformed computer vision and dramatically improved machine translation. It...

    The post Deep Learning: Strengths and Challenges appeared first on InData Labs.


    July 25, 2018

    Ronald van Loon

    Artificial Intelligence in Motion

    The developments in the field of Artificial Intelligence (AI) are largely unprecedented and have opened the door to many new pathways. The developments long expected of AI are finally materializing, and we have entered a stage of consistent progress. Having been associated with AI for much of the last decade, I have witnessed its growth over time. The technology has grown at a rate of knots, and it can safely be said that it now stands at a crucial juncture.

    As part of my efforts to keep up to date with the changing AI ecosystem, I attended the Atos Technology Days in Paris. The event was a rewarding experience for me, as I got to speak with, and attend keynotes by, leaders from Atos such as Thierry Breton, Chairman and CEO of Atos; Philippe Duluc, CTO for Big Data and Security (BDS); Phillipe Vannier, Group CTO; and Arnaud Bertrand, Head of BDS Strategy and Innovation.

    It felt great to be part of the conference, where some interesting insights were shared. According to Gartner research, by 2021 40 percent of all consumers may be using smart technologies in applications. Moreover, by 2020, 50 percent of all BA software is expected to use prescriptive analytics. Given the bright future of AI, there was a lot to talk about.

    How to Approach AI

    The first step in approaching AI is to understand the key ingredients of the process. AI is growing rapidly, and considering all that it has to offer, there is little doubt about its benefits. However, before we talk about approaching AI, let us look at the three ingredients leading this change.


    Data is the necessary fuel of all artificial intelligence. It is the explosion of data in the previous years that has helped us into this age of AI. Without the unprecedented wave of data dictating the way for us, we wouldn’t be able to implement so many Machine Learning (ML) techniques, get predictive analysis, and implement changes. Data hence provided the raw material that AI needed to develop.


    Knowledge or algorithms play an important part in AI as well. While data is extremely important in itself, it is the knowledge that we use to extract sense from it that dictates the way into the future. Going into the future requires the implementation of unprecedented changes and learning methods. Machine Learning and other methods have given data the platform it needed to become a source of artificial intelligence.

    Computing Power 

    Computing power is today at the heart of this revolution. Neural networks have existed for decades, but it is the computing power available today that has driven the bigger change in AI. Advances in high-performance computing can be credited with the change here, as they have made it possible to run heavy Machine Learning systems smoothly.

    The world is growing hybrid, combining traditional IT with private, managed and public clouds, and Atos has captured this by creating a unique hybrid experience. I spoke to Arnaud Bertrand from Atos about their efforts in this regard, and he mentioned that the hybrid experience promised by Atos combines on-site computing, private cloud, and edge computing. To ensure the security of data in the cloud, it is necessary to combine the cloud with these other solutions in a hybrid setting.

    AI Use Cases

    There are numerous use cases of how Atos has redefined the AI experience by creating the perfect mix of AI offerings for their clients. The following use cases will help explain their services in AI better:

    Connected Cooler 

    Atos is Coca-Cola Hellenic Bottling Company’s official IoT partner. The two recently entered into an agreement to deliver more than 300,000 connected coolers by the end of 2018. These coolers are to be installed in nearly 30 countries, and their main goal is to improve customers’ routine experience with Coca-Cola vending machines and connected coolers.

    What the connected cooler brings to the picture here for Coca-Cola is:

    • It achieves unprecedented efficiency for Coca-Cola by enhanced methods of predictive maintenance and placement.
    • It improves inventory, product placement, and stock optimization by following interactive AI methods to serve the purpose.
    • It increases sales by linking up targeted promotions with the connected consumers.

    By promoting connectivity, the connected cooler will be giving Coca-Cola a chance to achieve brilliant success here.

    Prescriptive Maintenance

    The State Department of Virginia wants to protect the technology infrastructure within the state. They plan to do this through the next generation of AI-powered cybersecurity solutions. The solution will help identify many future attacks and contain them to the point where they no longer pose a potent threat. It covers many different aspects, from threat detection to access point security and vulnerability management.

    Cybersecurity has always been tough to manage, as isolated intrusions are hard for most systems to detect. As such attacks have grown over the last few years, the need has emerged for a system that recognizes an attack and uses that information for further detection. The prescriptive Security Operation Center detects the signals left by such attacks and alerts security managers to possible risk areas, even before an attack happens. This not only gives insight into how cyber attacks are most often carried out, but also helps companies stop cyber attacks from disrupting their services.

    Digital Twin

    Atos signed their global strategic partnership with Siemens in 2011. Ever since then, they have joined hands to market the MindSphere platform by Siemens. The platform is basically a cloud based operating system that gives customers the freedom to connect their physical infrastructure and legacy systems to the digital world.

    With digital twin technology, manufacturers can create a real-time digital replica of their physical assets and use it for comparison and analysis. This gives manufacturers the ability to find new ways of improving the production process. Automotive engineers could benefit from the twin technology by creating a prototype of a car in the digital world rather than in the physical world. Only when all digital testing has been done would they need to move on to physical testing.

    Atos’s efforts to put AI in motion have helped their clients go a long way, and they endeavor to create such solutions going into the future as well. You can learn more about the possibilities of AI by watching videos of keynotes from the event.


    Ronald helps data-driven companies generate business value with best-of-breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining. He is a guest author on leading Big Data sites, a speaker, chairman and panel member at national and international webinars and events, and runs a successful series of webinars on Big Data and Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies, and is now director at a data consultancy company that is a leader in Big Data and data process management solutions. He has a broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.

    The post Artificial Intelligence in Motion appeared first on Ronald van Loons.


    July 24, 2018

    Big Data University

    React on Rails Tutorial: Integrating React and Ruby on Rails 5.2

    Users expect a certain level of interactivity and speed when using websites, which can be hard to provide with server-rendered websites. With a regular Rails project, we can sprinkle interactivity on the client side with vanilla JavaScript or jQuery, but for complex user interfaces this quickly becomes tedious to maintain and work with.

    In this tutorial, we’re going to look at integrating React into an existing Rails 5.2 app with the react_on_rails gem, in order to provide an optimal user experience and keep our codebase clean at the same time.

    Suppose you are running your own app store called AppGarage. Users are able to see popular apps, download them, and search for new apps.

    Currently, the website is built only with Ruby on Rails so users have to type the whole search term, submit, and wait for a page refresh before seeing the search results.

    Users expect content to be loaded as they type so that they can find apps faster. Wouldn’t it be nice to upgrade our search functionality to dynamically fetch and render search results as the user types their query? Let’s do that with React!

    This tutorial assumes a basic understanding of Git, Ruby on Rails, and React/JavaScript.

    Ensure the following are installed on your device:

    • Ruby on Rails v5.2 or greater
    • Git
    • Node/NPM/Yarn

    Initial Setup


    We begin by cloning the repository for our project from GitHub, which includes the entire static website built with Ruby on Rails and no React integration.

    Use the following command to clone the repository:

    $ git clone

    After cloning, enter the app-garage folder:

    $ cd app-garage

    Migrate and Seed the Database

    Now that we pulled down the code for our project, we must prepare rails by migrating and seeding our database:

    $ rails db:migrate && rails db:seed

    Start Server

    Our database now has the correct schema and is seeded with initial sample data in order to easily visualize our code changes. We can now start the rails server (Note: it may take rails a while to start the server on its first run):

    $ rails server

    You can now head over to http://localhost:3000 and you’ll see that our base application is working. We can view the homepage, search for apps, and view specific apps.

    Website Screenshot


    Now that we have a working web app, we’re ready to improve it by integrating React on Rails and modifying the search functionality.

    Installing React on Rails

    Note: If you’re following this tutorial using a different existing Rails app or if you’re using a Rails version older than 5.1 you should take a look at the official react-on-rails documentation for integrating with existing rails projects.

    Adding and Installing Gems

    First, we must add the webpacker,  react_on_rails and mini_racer gems. Edit the Gemfile and add the following to the bottom of the file:

    After adding the gems to the Gemfile, install them with the following command:

    $ bundle install

    Setting up Webpacker and React

    Now that the required gems are installed, we can begin configuration. First, we configure Webpacker by running:

    $ bundle exec rails webpacker:install

    Now that webpacker is configured, we install and configure React:

    $ bundle exec rails webpacker:install:react

    We should now see the following in our terminal:

    Webpacker now supports react.js 🎉

    Note: We can delete the autogenerated sample file: app/javascript/packs/hello_react.jsx

    Setting up React on Rails

    Currently, our project has Webpacker and supports React but we do not have an integration between React and Ruby on Rails. We need to add our changes to version control, so we add all of our changes and commit with the following command (Note: it’s important to commit our changes otherwise we will get warnings when continuing the tutorial):

    $ git add . && git commit -m "Add webpacker & react"

    Add the react-dom and react-on-rails packages to our package.json by running:

    $ yarn add react-dom react-on-rails

    Now create config/initializers/react_on_rails.rb with the following content:

    We’re now ready to start writing JavaScript and React components.

    Implementing the Search Component

    Starting simple, we’re going to take our current search view and have it render as a React component without changing any functionality.

    Create the following structure in your application folder: app/javascript/components

    We can now create our search component called Search.jsx inside the folder we just created with the following content:

    The above is our markup for searching converted to JSX in order for React to render it as a component. Note that we changed the HTML class and autocomplete attributes to className and autoComplete respectively for JSX to properly render our markup. This is required because we are writing JSX which is a syntax extension to JavaScript.

    We now have a search component but React on Rails knows nothing about it. Whenever we create a new component that we want to use in our Rails app, we must register it with react-on-rails in order to be able to use it with the react_component rails helper. To do so, we edit the app/javascript/packs/application.js file to have the following content:

    The application.js file now serves as a way for us to register our components with react-on-rails. In our case, it’s acceptable to include our search component on every page load, but for real-life production applications, it’s not very performant to include every component on every page. In real-life applications, components would be split into webpack bundles which are loaded on pages where they are needed.

    Now we include our application bundle in our layout on every page by editing app/views/layouts/application.html.erb to have the following content:

    Now, we’ll replace our homepage markup with the react-on-rails react_component helper to render our Search component by editing app/views/home/index.html.erb to have the following content:

    Adding React Functionality to our Replacement

    Our search is now rendered as a React component, but all of our functionality has remained the same, so the difference is not yet noticeable to users. We’re now able to start making our search dynamic.

    We need to be able to fetch our search data as JSON but we currently don’t have a JSON endpoint for our search controller. To do this, we add the file app/views/search/index.json.jbuilder with the following content:

    Now our search data is accessible as JSON via /search.json.

    To access our search data from the client-side JavaScript, we need to add a library for fetching data asynchronously with the browser. In this tutorial, we’ll use the axios library since it also supports older browsers. To install the library, simply run the following command in your terminal:

    $ yarn add axios

    Now that we have our dependencies installed, we can begin improving our search component. We must start tracking the text written into the search field, fetching the search results for the current text, and updating the state. Here is the new content for app/javascript/components/Search.jsx:

    • To start, we defined our component’s state (and initial state) to include our search results and whether or not we’re currently loading/fetching new results.
    • Next, we wrote our onChange function, which gets called each time the value in the search field changes. We use axios to send an HTTP request to our new /search.json endpoint with the current search field text. Axios will either successfully fetch results, in which case we update our state to include the results, or it will fail, in which case we update our state to have no results.
    • Our render function stays almost the same. We alter the input field by adding an onChange handler that points to the onChange function we just wrote.

    The updated search component now dynamically fetches and stores the user’s search results based on the current text, but doesn’t render anything related to the results yet.
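The state handling described in the list above can be sketched independently of React. The following Python model is only an illustration of the logic, not the component's JSX: `next_search_state` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the axios request to /search.json.

```python
def next_search_state(term, fetch):
    """Return the component state after the search text changes to `term`.

    `fetch` is a hypothetical callable standing in for the HTTP request to
    /search.json; it returns a list of results or raises on failure.
    """
    try:
        results = fetch(term)
    except Exception:
        results = []  # a failed request leaves us with no results
    return {"results": results, "loading": False}


def fake_fetch(term):
    """Stand-in for the server: filter a fixed list of app names."""
    if not term:
        raise ValueError("empty query")
    apps = ["Calendar", "Calculator", "Notes"]
    return [name for name in apps if term.lower() in name.lower()]


state = {"results": [], "loading": False}      # initial state
state = {**state, "loading": True}             # user typed: request in flight
state = next_search_state("cal", fake_fetch)   # request finished
print(state)
print(next_search_state("", fake_fetch))
```

Both outcomes end with `loading` set back to false, which is what lets the render step decide between showing a loading indicator, a result list, or nothing.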

    Rendering the Dynamic Search Results

    In order to render the search component’s state, we will create two new components that will make our code easier to manage.

    First, we create the SearchResult component which is a purely functional component with no state and it renders declaratively based on props. The prop we expect is a result which is a regular app object from our rails application. Create app/javascript/components/SearchResult.jsx with the following content:

    Now, we create a SearchResultList which is also a purely functional component in order to render our result array as SearchResult components. The SearchResultList will expect two props, the first is results, an array of our search results and the second is whether or not we’re currently loading new results. Create app/javascript/components/SearchResultList.jsx with the following content:

    Our SearchResultList will iterate through our search results and map them to render as a SearchResult component. We added a style attribute to the container in order to properly display the results under our search field.

    Now that we have our two helper components we can modify Search.jsx to render its state when the result array is not empty. Update app/javascript/components/Search.jsx with the following content:

    The changes we made to the Search component were:

    • Imported our SearchResultList component
    • Updated the render function to render the SearchResultList component when we have results or when we are loading.

    We’ve now integrated React on Rails into our Rails 5.2 app in order to have a dynamic search component for our users.


    We started with a regular rails application and went through the process of installing and configuring Webpacker, React, and React on Rails. After configuring, we replaced our search to be a react component which dynamically fetches and renders search results from a new JSON endpoint on our Rails app.

    Initial Application

    The original implementation above wasn’t a good user experience since it involved typing the full query and waiting for the page to load before seeing any results.

    Updated Implementation

    The new implementation above shows search results as the user types, which saves time and provides a much better user experience.

    We can now begin adding even more interactivity to our website by implementing additional react components and reusing existing components on other pages.

    Preview Final Version

    You can preview the final version by following these steps:

    1. Clone the final-version branch of the GitHub Repository
      $ git clone -b final-version
    2. Enter the newly cloned app-garage folder:
      $ cd app-garage
    3. Run the necessary setup commands:
      $ yarn && rails db:migrate && rails db:seed
    4. Start the Rails server and navigate to http://localhost:3000 (note: the initial load may take a while):
      $ rails server

    Further Reading

    If you want to learn more about integrating React with Ruby on Rails (such as proper state management with Redux or handling bundles for specific pages), the repository and documentation for React on Rails are a great place to look.

    The post React on Rails Tutorial: Integrating React and Ruby on Rails 5.2 appeared first on Cognitive Class.

    Revolution Analytics

    A quick tour of AI services in Azure

    If you're after a quick overview of some of the services available in Azure to build AI-enabled applications, you might want to check out the 6-minute video below. It provides a tour of three...

    Ronald van Loon

    AI Meets Industrial IOT: Power Your Data-Driven Enterprise

    The rapidly progressing Artificial Intelligence (AI) technology offers a lot of new opportunities to businesses in the industrial sector. Smart machines that are capable of performing repetitive tasks with accuracy and self-correcting errors would be the perfect solution for any factory involved in large scale production.

    At the same time, the Internet of Things (IoT) technology can improve efficiency, scalability, and connectivity for industrial organizations while saving time and costs. Companies have started applying the sensing technology to improve workplace safety and operational efficiency, while reducing the cost of unnecessary maintenance.

    The implementation of AI and IoT together will give companies a competitive advantage and help build the data-driven businesses of tomorrow.

    Potential of AI & IoT in the Industrial Sector

    The combined application of the two technologies is expected to extend data analytics beyond answering questions we have today, to solve unexplored questions, quickly and effortlessly. Combined AI and IoT predictive modeling used by data analysts would allow businesses to transition from descriptive analytics (i.e., there is a problem) to prescriptive analytics (i.e., here is how to solve the problem.)

    We discussed the power of AI & IoT for industries with Steve Fearn, Chief Technologist at Hewlett-Packard Enterprise Group (HPE). Fearn explained that manufacturing concerns can improve their production processes by applying machine learning to different phases of the assembly process.

    Fearn gave the example of a video analytics tool that retrieves data from the manufacturing execution system and compares it with the video image that it sees on the assembly belt to ensure that products match customer orders.

    Often, manufacturers create the same product for all customers and this uniformity creates long production runs and saves costs. However, more and more buyers are demanding highly customized products that differ in terms of size, style, shape, design, and color.

    The application of IoT allows manufacturers to precisely create goods that are highly customized, yet produced on a large scale, similar to homogenous production.

    The video analytics system can verify the slight variations in products. Quality assurance through video can also check for any misaligned products, missing information, or faulty products. Ultimately, video analytics allows for a very high level of quality assurance without the need for trained individuals to do the job.

    Managing the Digital Transformation

    While digital transformation with IoT and AI can automate answers to old issues and solve new questions, one of the main challenges is managing the evolution from current production systems to ones based upon AI and IoT. Overcoming the challenge will require a singular effort by all groups developing, deploying, and using the system. This requires a systematic approach of identifying the current situation, highlighting the challenges, finding ways to resolve problems, and delivering results that give businesses a competitive advantage.

    Current Status

    The current system of data analytics allows businesses to collect a huge volume of information about their customers, products, or business processes. However, there is no common framework for gathering and structuring data in a silo that can be shared across all the departments of the organization.

    This leads to a number of problems for businesses.

    • Production Inefficiencies

    Manufacturing businesses often miss opportunities for reducing the cost of production due to incomplete data sharing and analysis.

    • Design Flaws

    When the data about customer preferences or needs isn’t shared across all levels of production, the final product may contain features that are not needed, or it could be missing features that are required by the client.

    • Marketing Opportunities Missed

    When customer demand and expectation data is inaccurate, businesses often miss opportunities in the market by producing less than their marketing team can sell.

    • Customer Response

    The lack of shared data results in delays when responding to defects reported by customers. A delay could also lower the quality of customer service.

    Implementing Change

    Businesses looking to apply the emerging technologies of AI and IoT would also need to consider impacts on current employees. As with any new technology, Industrialized IoT would drastically improve production efficiency, but it could also result in job changes or even job loss.

    However, Doug Smith, the CEO for Texmark Chemicals, took a different and perhaps even more effective approach: envisioning technology as a catalyst to empower employees to specialize and do more, rather than merely replacing them. Doug Smith notes that when his company began to implement IoT for production of chemicals, he made sure that operators and other employees were involved in the transformation initiative from day 1. One of the most important parts of the IoT journey for Texmark Chemicals was the involvement of the people who actually do the work in the plant. Without their support, input and vetting, it would have been impossible to introduce the sensor-enabled assets and other IoT technologies into their business.

    Texmark is seen as a leader in innovation and digital transformation – it is called The Refinery of the Future – for the petrochemical industry. The company is not just talking about implementing the AI and IoT, but is in the process of implementing cutting-edge technologies that increase revenue, efficiency, safety, security and productivity, all positive impacts to the organization and the bottom line.

    Resolving Challenges

    One way to improve the data processing for industries is through edge analytics. Edge analytics improve the process of data collection and speed of analysis by performing these functions where the data is gathered.

    Eddy Biesemans, the global account manager for Schneider Electric, explained that big data and edge analytics would allow more complex calculations to take place in real-time while removing the problem of latency, saving costs on bandwidth usage and increasing security.

    For instance, consider an airplane manufacturer looking to enhance their product’s performance by flying it for 2 million miles. It would simply be lengthy and inefficient for the manufacturer to achieve this goal.

    However, the manufacturer could install sensors on a hundred airplanes that collect and process different performance variables while the aircraft is flying. Instead of transferring data to the centralized analytics system, the analytics model would be executed as the data is generated during a test.
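
    As a rough illustration of analytics running where the data is gathered, the sketch below keeps a running summary of one sensor variable and reports only that summary instead of shipping every raw reading. The class name, field names, and sample readings are hypothetical and are not taken from any vendor's product.

```python
import math


class EdgeSummary:
    """Running count, mean, min and max for one sensor variable."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.minimum = math.inf
        self.maximum = -math.inf

    def update(self, value):
        # Incremental mean: no raw readings need to be stored on the device.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def report(self):
        # Only this small dictionary would be sent to the central system.
        return {
            "count": self.count,
            "mean": self.mean,
            "min": self.minimum,
            "max": self.maximum,
        }


summary = EdgeSummary()
for reading in [10.0, 20.0, 30.0]:  # made-up sensor readings
    summary.update(reading)
report = summary.report()
```

    The design choice here is to trade detail for bandwidth: the central system receives a few numbers per reporting interval rather than the full stream.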

    Delivering Results

    The first rule of statistics is to plot the data, as much can be inferred by simply looking at the data. Similarly, manufacturing businesses can achieve great results by developing a framework that leverages the high volume of data stored in data silos across different departments. The central idea is to take the data generated by different departments, such as sales, production, purchasing, etc., and make it accessible to all the participants in the organization.

    For an auto manufacturer, data would follow the design or product lifecycle. The data cycle would begin with R&D, prototype, and component designing. Batch and performance data would be captured during manufacturing and quality assurance and shared across the organization. The output from these functions would be used by sales to alert customers of shipments, as well as by after-sales service to improve support, especially for issues related to specific batches. Data generated by customers could be looped back into R&D.

    This would create a continuous, virtual loop where all the departments would be involved. The data sharing across different organizational departments is aptly named closed-loop manufacturing.

    Closed Loop Manufacturing

    Closed loop manufacturing has the potential to become the industry standard for manufacturing businesses. The goal of the framework is to capture complete data about products such as how they are used and in what context, performance, demand fluctuations based on pricing, as well as customer rating.

    Closed loop manufacturing would give industrial businesses four main advantages.

    • It would help businesses identify the features that are critical to producing components of a high quality.
    • It means starting small but thinking big. A company can transform step by step without putting production at risk.
    • Engineers would be able to carry out a powerful root cause analysis with the framework, as it encompasses data across the lifecycle, not just from design or production.
    • The framework can be continuously improved as the results from one cycle are used as input for the next stage.

    The Outcome

    Manufacturing businesses can take three lessons from the developments of new business processes.

    • Progress in AI and IoT will significantly change production capabilities.
    • Data visibility across the organization will improve a manufacturer’s response to defects and quality issues, and promote faster response to customers.
    • The closed-loop manufacturing will significantly improve production and transform businesses through continuous improvement.

    The key takeaway is to think big, but start small and learn through the process – and surround yourself with the right ecosystem of partners.

    Hewlett Packard Enterprise will be the event lead at the upcoming Industry of Things World event in Berlin, Germany, taking place on September 23 through 25. If you are planning to attend, don’t forget to stop by their booth and learn more about Industrial IoT solutions and AI.


    HPE Industrial IoT solutions

    HPE IoT Blogs & Community

    HPE Edgeline Converged Edge Systems



    Ronald helps data-driven companies generate business value with best-of-breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining. He is a guest author on leading Big Data sites, a speaker, chairman and panel member at national and international webinars and events, and runs a successful series of webinars on Big Data and on Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a Data Consultancy company, a leader in Big Data & data process management solutions. He has a broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.


    The post AI Meets Industrial IOT: Power Your Data-Driven Enterprise appeared first on Ronald van Loons.


    July 23, 2018

    Revolution Analytics

    Highlights from the useR! 2018 conference in Brisbane

    The fourteenth annual worldwide R user conference, useR!2018, was held last week in Brisbane, Australia and it was an outstanding success. The conference attracted around 600 users from around the...


    Revolution Analytics

    AI, Machine Learning and Data Science Roundup: July 2018

    A monthly roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications I've...


    July 21, 2018

    Making Data Meaningful

    Towards Location-Based Analytics

    Juan Huerta is a contributing author to Making Data Meaningful. He is currently a Senior Data Scientist at PlaceIQ where he focuses on location-based analytics. Juan was a speaker at the 2013 Business Intelligence Symposium, where he spoke about his work extracting patterns, trends, intelligence and context from large amounts of structured and unstructured data. He holds a PhD from Carnegie Mellon University and resides in the Greater New York City area.


    The availability of data incorporating location information is growing. This influx of data has been affected by the emergence and broad adoption of mobile, the intersection of diverse streams of information, the abundance of data-generating and location-enabled devices, and the availability of tools and techniques to extract insights from this type of data, among other things.

    Location-annotated data is increasingly abundant.

    In addition to its abundance, location information has proven its value as a proxy for human behavior. Location is a primary marker of consumer intent, both at an individual and segment level. The ebbs and flows resulting from constant movement of mobile devices provide us with a picture from which patterns and insights can be extracted.

    Because of these characteristics, it is not surprising that there is an increasing interest across industries in attaining movement-based consumer insights. Marketers, analysts, and decision-makers are realizing the value of this type of data in delivering new types of consumer insights. The possibilities promised by the juxtaposition of information streams relating location, movement, demographics and behavior, are truly exciting.

    At the same time, because of its particular dynamic and large-scale nature, the mechanisms, tools and abstractions available for general data do not seem to suffice. Customized approaches to leverage this data are necessary.

    Here are a few considerations we need to make when approaching this domain:

    Data: We must consider the nature of the data. More specifically, where is the location-related data coming from? To better understand this data, we can categorize it into two types – static and dynamic (i.e., movement data). Static data includes census-related data, satellite photography and maps, business listings, and so on. Dynamic data includes events that occur and are registered as consumers move around in their daily lives. The most important source of dynamic information is the data generated by mobile devices. As ad requests are generated, devices leave a “digital footprint,” or trace, of ad-requests generated. Billions of these requests are generated daily by mobile devices. That’s some big data.

    Framework of reference: After provisioning the data, we need a way of organizing it. Given its enormous volume and heterogeneous nature, how can one make sense of it all without being overwhelmed? To help us in this task, a grid-based frame of reference is very useful. In this way we can aggregate, or tally, the location types that are located within a tile. Static information is anchored in this grid, while dynamic information is overlaid on this grid. It is easier to perform searching and averaging with this system. Similarly, time can be discretized.
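
    A grid-based frame of reference of this kind can be sketched in a few lines. The example below is only an illustration: the tile size, the one-hour time bucket, and the sample events are assumptions chosen for readability, not values used by any particular system.

```python
import math
from collections import Counter

TILE_DEG = 0.001  # hypothetical tile size, in degrees of latitude/longitude
BUCKET_SECONDS = 3600  # hypothetical time discretization: one-hour buckets


def tile_of(lat, lon, size=TILE_DEG):
    """Map a coordinate to the integer (row, col) index of its grid tile."""
    return (math.floor(lat / size), math.floor(lon / size))


def time_bucket(timestamp):
    """Discretize a timestamp (seconds) into an integer bucket index."""
    return timestamp // BUCKET_SECONDS


def tally(events):
    """Count dynamic events per (tile, time bucket)."""
    counts = Counter()
    for lat, lon, timestamp in events:
        counts[(tile_of(lat, lon), time_bucket(timestamp))] += 1
    return counts


# Made-up events: (latitude, longitude, seconds since some reference time).
events = [
    (40.7411, -73.9897, 7200),
    (40.7414, -73.9893, 7300),
    (40.7505, -73.9805, 7400),
]
counts = tally(events)
```

    Static information (business listings, census data) could be keyed by the same tile index, which is what makes joining it with the dynamic tallies straightforward.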

    Abstractions: We must have the adequate abstractions to join the dynamic data with the static data. One powerful abstraction is the audience. An audience is a segment of the population with homogeneous patterns of behavior. Naturally, when working with location information, location should substantially inform our audience design. Audiences, in turn, can be designed and characterized by the information they convey – for example, audiences based on location categories, audiences based on behaviors, and audiences enriched by the use of contextual features.

    Algorithms, Metrics and Analysis: After the data and audiences are in place, it is necessary to have the right location-information processing algorithmic pipelines as well as adequate metrics. In the case of location, an example of a very powerful measurement is Place Visit Rate (PVR™), which measures the percentage of people that visit a location of interest during a given period of time. When applied to marketing campaigns, we can focus on the PVR™ lift that a campaign attains.
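
    To make the metric concrete, here is a minimal sketch of a visit-rate calculation over sets of device identifiers. The text only defines the rate itself; the lift formula used here (relative difference between an exposed group and a control group) is an assumption, and the function names and sample identifiers are hypothetical.

```python
def place_visit_rate(visitors, audience):
    """Percentage of the audience observed at the location during the period."""
    if not audience:
        return 0.0
    return 100.0 * len(visitors & audience) / len(audience)


def visit_rate_lift(exposed_rate, control_rate):
    """Relative lift of the exposed group over the control group, in percent.

    Assumed definition: (exposed - control) / control.
    """
    if control_rate == 0:
        raise ValueError("control rate must be non-zero to compute lift")
    return 100.0 * (exposed_rate - control_rate) / control_rate


# Made-up device identifiers.
visitors = {"a", "b", "x"}                # devices seen at the location
exposed = {"a", "b", "c", "d"}            # devices that saw the campaign
control = {"x", "y", "z", "w", "v"}       # comparable devices that did not

exposed_rate = place_visit_rate(visitors, exposed)
control_rate = place_visit_rate(visitors, control)
lift = visit_rate_lift(exposed_rate, control_rate)
```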

    Bringing it all together: Once the right data, abstractions, and algorithms are in place, it is possible to address questions that were once difficult – or even impossible – to answer. For example, if we were interested in understanding and analyzing the drivers to purchase for a particular retailer, we could focus on measuring and characterizing the PreVisit behaviors in terms of audiences (i.e., “where was this particular group observed before they shopped at X”). Additionally, we could focus on response rates in terms of PVR™ lift for different behavioral audiences.

    Not only does location-based analytics enable us to answer these types of questions, but it also opens the door to many new types of analysis. The possibilities are truly endless.

    The post Towards Location-Based Analytics appeared first on Making Data Meaningful.

    Making Data Meaningful

    Gain New Insight From Existing Data

    The company is doing well in this difficult economy, but leaders are looking for new ways to grow revenue and increase profits. That stack of old printed reports on their desks just wasn’t cutting it anymore, so the company has invested in a new data visualization system that provides more insight into operations than the old, standard reporting. The visualizations, dashboards, data mining and other tools allow analysts to discover the golden nuggets that have been hidden and overlooked. These new ways of looking at the company’s performance will allow the business leaders to set strategic goals, make critical decisions, and expose potential problems before they affect the bottom line.


    Do My New Tools Tell the Same Story?

    This is a big step forward for any company, but how do you gain trust in the new analytics? How do you know that those first results are pointing the company in the right direction? Before using the power of visual analytics to plan the future, take a step back and analyze the past. Take the results of the last five years. The bottom-line has already been determined. Positive and negative performance has been reviewed. Do the new tools tell the same story? Every company has a series of questions they ask to determine overall company performance, set direction, and propose change. Try to answer these questions using the new tools. Do the answers to those questions still hold true or is the new analysis painting a different picture of company operations? If the new analysis is producing the same answers, then the direction should be true. If different answers are uncovered, then it might be time to step back and re-evaluate the direction before getting lost.

    What Insights Can I Gain From My Existing Data?

    One universal company goal is to find new ways to grow the business. All that data is just sitting there waiting to tell a story. Use these new tools to do data mining and create visualizations of the data to provide different perspectives. What new trends or insight can be found in the existing data that were overlooked in the old reporting? The business should take some time and determine new questions and use visualization to provide answers:

    • Is there something common among the top customers that can drive more business?
    • Is there something common among the bottom customers that can be improved to increase business?
    • How does geographic location influence sales?
    • How should marketing advertise based on population density in major markets?
    • What is my client distribution across my product lines?

    The point is that to successfully grow future business one should also understand past business. Spend time understanding what was successful and build on it. Understand what did not go according to plan and learn from those mistakes. Spend time analyzing the existing data in new and different ways. Think outside the box and those golden nuggets might just make their way to the surface.


    The post Gain New Insight From Existing Data appeared first on Making Data Meaningful.


    July 20, 2018

    Making Data Meaningful

    Big Data – The 4 V’s: What Was Old is New Again; Part 1

    In April 2012, a social media monitoring company reported a 1,211% increase in use of the term “Big Data” from March 2011 to March 2012 in a survey of English-language social media channels. Big Data is certainly one of the key buzzwords of our time.

    In a 2001 META Group article, Doug Laney presented the three “V”s of Big Data: Volume, Velocity and Variety. Others have proposed various fourth “V”s, such as Vulnerability, Veracity and Value. None of these contributes to the fundamental definition. They are consequential.

    When people think of Big Data they often focus on the first “V”, volume; after all, it is called Big Data; but, large data volumes are nothing new. Data has always been “big” relative to the technology to make use of it.

    The original Big Data was the Library at Alexandria, which contained the combined experiences and learnings of ten centuries. In 1944, the concern was that American University libraries were doubling in size every 16 years and that the number of published volumes would outpace the ability to physically store them, let alone access and derive value from them.

    Data has always been big, but never nearly as massive as it is today. For over a decade, we have heard about the early pioneers of this generation’s big data: Wal*Mart, Google, eBay, Amazon, the Human Genome Project, and the new trailblazers such as Internet giants Facebook, Twitter, eHarmony, and comScore. Additionally there are ubiquitous sensor based data generators in Hospital Intensive Care Units, Radio Frequency IDs tracking products and assets, GPS systems, smart meters, factory production lines, satellites and meteorology, the list continues to grow.

    Market research firm IDC estimated that 1,800 exabytes of data would be generated in the year 2011. An exabyte is a unit of information equal to one quintillion (10^18) bytes, or one billion gigabytes. Estimates report that the world produced 14.7 exabytes of new data in 2008, triple the amount generated in 2003. Cisco Systems estimates that by 2016, annual Internet traffic will reach 1.3 Zettabytes, where a Zettabyte is 10^21 bytes, or one trillion gigabytes. To put that in perspective, all the internet traffic in the years 1984 to 2012 has generated a total 1.2 Zettabytes. We will soon be generating in one year what has taken 26 years to accumulate.
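
    The unit arithmetic in the paragraph above is easy to check. The short sketch below assumes decimal (SI) units throughout and uses only the figures quoted in the text.

```python
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21

# How many gigabytes fit in one exabyte and in one zettabyte.
gb_per_exabyte = EXABYTE // GIGABYTE      # one billion
gb_per_zettabyte = ZETTABYTE // GIGABYTE  # one trillion

# The IDC estimate of 1,800 exabytes, expressed in zettabytes.
idc_2011_zettabytes = 1800 * EXABYTE / ZETTABYTE
```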

    The focus on the size attribute of Big Data is understandable in the face of these statistics, and stems from limitations in the technology available at the time, to acquire, process, and deliver these large volumes of data in a reasonable amount of time to make that data meaningful to the decision makers in the business. Traditional Relational Database technologies and methods of loading, storing and retrieving data were incapable of keeping pace with the speed necessary to analyze and act on the data.

    With the advent of new storage and query technologies such as Hadoop, MapR, Cloudera, Teradata Aster, IBM Netezza, NoSQL, NuoDB, MongoDB, CouchDB, HBase, etc., volume becomes the least important of the three “V”s.

    Volume alone does not define Big Data. Big Data is more about the second and third “V”s, Velocity and Variety. Part two of this Big Data series will delve into the Velocity factor.

    The post Big Data – The 4 V’s: What Was Old is New Again; Part 1 appeared first on Making Data Meaningful.

    Revolution Analytics

    A hex sticker wall, created with R

    Bill Venables, member of the R Foundation, co-author of the Introduction to R manual, R package developer, and one of the smartest and nicest (a rare combo!) people you will ever meet, received some...


    July 18, 2018

    Big Data University

    Scaling Our Private Portals with Open edX and Docker

    Ever since we launched, Cognitive Class has hit many milestones. From name changes (raise your hand if you remember DB2 University) to our 1,000,000th learner, we’ve been through a lot.

    But in this post, I will focus on the milestones and evolution of the technical side of things, specifically how we went from a static infrastructure to a dynamic and scalable deployment of dozens of Open edX instances using Docker.

    Open edX 101

    Open edX is the open source code behind It is composed of several repositories, edx-platform being the main one. The official method of deploying an Open edX instance is by using the configuration repo which uses Ansible playbooks to automate the installation. This method requires access to a server where you run the Ansible playbook. Once everything is done you will have a brand new Open edX deployment at your disposal.

    This is how we run, our public website, since we migrated from a Moodle deployment to Open edX in 2015. It has served us well, as we are able to serve hundreds of concurrent learners over 70 courses every day.

    But this strategy didn’t come without its challenges:

    • Open edX mainly targets Amazon’s AWS services and we run our infrastructure on IBM Cloud.
    • Deploying a new instance requires creating a new virtual machine.
    • Open edX reads configurations from JSON files stored in the server, and each instance must keep these files synchronized.

    While we were able to overcome these in a large single deployment, they would be much harder to manage for our new offering, the Cognitive Class Private Portals.

    Cognitive Class for business

    When presenting to other companies, we often hear the same question: “how can I make this content available to my employees?“. That was the main motivation behind our Private Portals offer.

    A Private Portal represents a dedicated deployment created specifically for a client. From a technical perspective, this new offering would require us to spin up new deployments quickly and on-demand. Going back to the points highlighted earlier, numbers two and three are especially challenging as the number of deployments grows.

    Creating and configuring a new VM for each deployment is a slow and costly process. And if a particular Portal outgrows its resources, we would have to find a way to scale it and manage its configuration across multiple VMs.

    Enter Docker

    At the same time, we were experiencing a similar demand in our Virtual Labs infrastructure, where the use of hundreds of VMs was becoming unbearable. The team started to investigate and implement a solution based on Docker.

    The main benefits of Docker for us were twofold:

    • Increase server usage density;
    • Isolate services’ processes and files from each other.

    These benefits are deeply related: since each container manages its own runtime and files we are able to easily run different pieces of software on the same server without them interfering with each other. We do so with a much lower overhead compared to VMs since Docker provides a lightweight isolation between them.

    By increasing usage density, we are able to run thousands of containers in a handful of larger servers that can be pre-provisioned ahead of time instead of having to manage thousands of smaller instances.

    For our Private Portals offering this means that a new deployment can be ready to be used in minutes. The underlying infrastructure is already in place so we just need to start some containers, which is a much faster process.

    Herding containers with Rancher

    Docker in and of itself is a fantastic technology but for a highly scalable distributed production environment, you need something on top of it to manage your containers’ lifecycle. Here at Cognitive Class, we decided to use Rancher for this, since it allows us to abstract our infrastructure and focus on the application itself.

    In a nutshell, Rancher organizes containers into services and services are grouped into stacks. Stacks are deployed to environments, and environments have hosts, which are the underlying servers where containers are eventually started. Rancher takes care of creating a private network across all the hosts so they can communicate securely with each other.

    Schematic of how Rancher is organized
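
    The hierarchy described above (containers in services, services in stacks, stacks deployed to environments that own hosts) can be modeled as a small data structure. This is only a sketch to make the nesting explicit: the class and field names are hypothetical and are not Rancher's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    name: str
    containers: List[str] = field(default_factory=list)


@dataclass
class Stack:
    name: str
    services: List[Service] = field(default_factory=list)


@dataclass
class Environment:
    name: str
    hosts: List[str] = field(default_factory=list)
    stacks: List[Stack] = field(default_factory=list)

    def container_count(self):
        """Total number of containers across every stack and service."""
        return sum(
            len(service.containers)
            for stack in self.stacks
            for service in stack.services
        )


# Hypothetical example: one Portal stack with two services on two hosts.
portal = Stack(
    name="portal-acme",
    services=[
        Service("lms", containers=["lms-1", "lms-2"]),
        Service("cms", containers=["cms-1"]),
    ],
)
production = Environment(
    name="production", hosts=["host-a", "host-b"], stacks=[portal]
)
total = production.container_count()
```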

    Getting everything together

    Our Portals are organized in a micro-services architecture and grouped together in Rancher as a stack. Open edX is the main component and is itself broken into smaller services. On top of Open edX we have several other components that provide additional functionality to our offering. Overall, this is what things look like in Rancher:

    A Private Portal stack in Rancher

    There is a lot going on here, so let’s break it down and quickly explain each piece:

    • Open edX
      • lms: this is where students access course content
      • cms: used for authoring courses
      • forum: handles course discussions
      • nginx: serves static assets
      • rabbitmq: message queue system
    • Add-ons
      • glados: admin users interface to control and customize the Portal
      • companion-cube: API to expose extra functionalities of Open edX
      • compete: service to run data hackathons
      • learner-support: built-in learner ticket support system
      • lp-certs: issue certificates for students that complete multiple courses
    • Support services
      • cms-workers and lms-workers: execute background tasks for `lms` and `cms`
      • glados-worker: execute background tasks for `glados`
      • letsencrypt: automatically manages SSL certificates using Let’s Encrypt
      • load-balancer: routes traffic to services based on request hostname
      • mailer: proxy SMTP requests to an external server or sends emails itself otherwise
      • ops: group of containers used to run specific tasks
      • rancher-cron: starts containers following a cron-like schedule
    • Data storage
      • elasticsearch
      • memcached
      • mongo
      • mysql
      • redis

    The ops service behaves differently from the other ones, so let’s dig a bit deeper into it:

    Details of the ops service

    Here we can see that there are several containers inside ops and that they are usually not running. Some containers, like edxapp-migrations, run when the Portal is deployed but are not expected to be started again unless in special circumstances (such as if the database schema changes). Other containers, like backup, are started by rancher-cron periodically and stop once they are done.

    In both cases, we can trigger a manual start by clicking the play button. This gives us the ability to easily run important operational tasks on-demand without having to worry about SSHing into specific servers and figuring out which script to run.

    Handling files

    One key aspect of Docker is that the file system is isolated per container. This means that, without proper care, you might lose important files if a container dies. The way to handle this situation is to use Docker volumes to mount local file system paths into the containers.

    Moreover, when you have multiple hosts, it is best to have a shared data layer to avoid creating implicit scheduling dependencies between containers and servers. In other words, you want your containers to have access to the same files no matter which host they are running on.

    In our infrastructure we use an IBM Cloud NFS drive that is mounted in the same path in all hosts. The NFS is responsible for storing any persistent data generated by the Portal, from database files to compiled static assets, such as images, CSS and JavaScript files.

    Each Portal has its own directory in the NFS drive and the containers mount the directory of that specific Portal. So it’s impossible for one Portal to access the files of another one.
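
    As a rough illustration of the per-Portal isolation described above, here is a minimal Python sketch that builds a Docker bind-mount spec for one Portal. The NFS root `/mnt/nfs/portals`, the container path `/data`, and the function name `portal_volume` are assumptions made up for this example, not the paths or code used in our infrastructure.

```python
from pathlib import PurePosixPath

# Hypothetical mount point of the shared NFS drive on every host.
NFS_ROOT = PurePosixPath("/mnt/nfs/portals")


def portal_volume(portal_name: str, container_path: str = "/data") -> str:
    """Return a Docker bind-mount spec ("host_path:container_path") for one Portal."""
    # Reject names that could escape the Portal's own directory.
    if not portal_name or "/" in portal_name or portal_name in (".", ".."):
        raise ValueError(f"invalid portal name: {portal_name!r}")
    return f"{NFS_ROOT / portal_name}:{container_path}"
```

    Because the host path is the same on every server, the same spec works no matter which host the container is scheduled on.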

    One of the most important files is ansible_overrides.yml. As we mentioned at the beginning of this post, Open edX is configured using JSON files that are read when the process starts. The Ansible playbook generates these JSON files when executed.

    To propagate changes made by Portal admins on glados to the lms and cms of Open edX we mount ansible_overrides.yml into the containers. When something changes, glados can write the new values into this file and lms and cms can read them.

    We then restart the lms and cms containers which are set to run the Ansible playbook and re-generate the JSON files on start up. ansible_overrides.yml is passed as a variables file to Ansible so that any values declared in there will override the Open edX defaults.
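
    The override idea can be sketched in a few lines of Python. This is only an illustration of "values in the overrides file win over the defaults", not the Ansible playbook itself; the function name `render_settings` and the setting names are hypothetical, and real Ansible variable handling is richer than a shallow dictionary merge.

```python
import json


def render_settings(defaults: dict, overrides: dict) -> str:
    """Merge overrides over defaults and return the resulting JSON text."""
    merged = {**defaults, **overrides}  # shallow merge: override values win
    return json.dumps(merged, indent=2, sort_keys=True)


# Hypothetical setting names, for illustration only.
defaults = {"PLATFORM_NAME": "Open edX", "SESSION_COOKIE_SECURE": False}
overrides = {"PLATFORM_NAME": "My Portal"}
settings = json.loads(render_settings(defaults, overrides))
```

    Any key missing from the overrides keeps its default value, which is why the overrides file only needs to list what a Portal admin actually changed.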

    Overview of file structure for a Portal

    By having this shared data layer, we don’t have to worry about containers being rescheduled to another host since we are sure Docker will be able to find the proper path and mount the required volumes into the containers.


    By building on top of the lessons we learned as our platform evolved and by using the latest technologies available, we were able to build a fast, reliable and scalable solution to provide our students and clients a great learning experience.

    We covered a lot in this post and I hope you were able to learn something new today. If you are interested in learning more about our Private Portals offering, fill out our application form and we will contact you.

    Happy learning.

    The post Scaling Our Private Portals with Open edX and Docker appeared first on Cognitive Class.

    Making Data Meaningful

    Hey You … Get Out of My Cloud!

    Do you remember these recent stories?  On July 31, 2012 Dropbox admitted it had been hacked. (Information Week, 8/1/2012).  Hackers had gained access to an employee’s account and from there were able to access LIVE usernames and passwords which could allow them to gain access to huge amounts of personal and corporate data.  Just four days later, Wired® writer Mat Honan’s Twitter account was hacked via his Apple and Amazon accounts (story in Wired and also reported by CBS, CNN, NPR and others).

    Did you notice the common theme behind these reports?  Hackers didn’t get through the defenses of the Cloud by brute force.  Instead, they searched out weak points and then exploited the further vulnerabilities that those entry points exposed.  In these examples – as in countless others – the weak points were processes and people.

    The Dropbox hack was made possible by an employee using the same password to access multiple corporate resources, one of which happened to be a project site which contained a “test” file of real unencrypted usernames and passwords.  Either one could be considered a lapse in judgment – I mean, who thinks it is a good idea to store unencrypted user access information on a project site??? – but added together, these lapses made a result much more dangerous than the sum of their parts.

    Mat Honan’s hack was made possible in part by process flaws at large and popular companies.  Again, each chink taken individually would likely not have been as damaging as the series of flaws building on each other.  Apple or Amazon individually didn’t provide enough information for hackers to take over Mr. Honan’s account, but taken together their processes and individual snippets of data provided the opportunity.

    My purpose in writing this isn’t to scare anyone away from the Cloud or its legitimate providers.  The Cloud is cost-effective, portable, scalable, stable, and here to stay.  And it is as secure as technology will allow.  But as these stories illustrate, technology isn’t the risk.  Information wasn’t compromised by brute-force hacking or breaking encryption algorithms.  Data was put at risk by people and processes.

    Have you ever worked with someone who messed up something royally by not following a documented process?  Or do you know someone who clicked a link in a bogus email and infected their laptop – or even the whole company – with a virus?  They might be working for your Cloud provider now.  Don’t rely on those folks to protect your data in the Cloud.  Instead, protect it yourself with Backups, Password Safety and Data Encryption before entrusting your precious data to the Cloud.  If a hacker gets into your Cloud, at least you won’t be the easiest target.

    The post Hey You … Get Out of My Cloud! appeared first on Making Data Meaningful.

    Making Data Meaningful

    Real-Time Analytics

    If you have ever shopped at Amazon you may have noticed a “Featured Recommendations” section that appears after your initial visit. These recommendations are updated automatically when the system notices a change in the shopping pattern of a particular member. This is real-time analytics at work. The system is using the data at hand and coming up with suggestions in near real-time. With more companies investing in mobile business intelligence initiatives, real-time analytics is an essential requirement to ensure a good return on investment.

    I think that the implementation of a solution to get real-time analytics could be a costly endeavor. This would require implementation of technologies like Master Data Management and delivery options like cloud and/or mobile BI. Cloud BI presents its own set of security concerns, which is why some of the region’s largest companies are hesitant to implement such a solution. According to one BI manager, the company’s executives do not support the notion of putting their data into the cloud without the implementation of certain security measures. Their need for a mobile BI strategy would require security that would enable the company to delete everything from a device if it is stolen or misplaced.

    Insurance companies and retail stores can greatly benefit from such technology. Off-site sales reps will be able to see current information about potential customers, including updated life-changing events, right on their mobile devices, which would increase the likelihood of either gaining a new customer or retaining an existing one*. In-store managers at grocery stores can get a real-time report about slow-moving items, allowing them to increase sales by changing displays. Real-time analytics can be on-demand, where the system responds to a specific request from an insurance sales rep, or it can be continuous, such as an hourly report to the manager of a grocery store**.

    Overall, real-time analytics gives a company a competitive advantage over its rivals but requires heavy investment into the implementation of the technology and the guarantee of proper security measures being put in place with delivery options like the cloud. This information is helpful for quick decisions, but companies should still make all major decisions by looking at historical data and studying the trends.


    *Pat Saporito, “Bring your Best”, Best’s Review, September 2011

    **Jen Cohen Crompton, “Real-Time Data Analytics: On Demand and Continuous”.  

    The post Real-Time Analytics appeared first on Making Data Meaningful.

    Making Data Meaningful

    Business Intelligence Adoption: Goal Setting

    Typically, strategic goals start off as high-level initiatives that involve revenue-based targets.  Revenue targets are followed up with operational efficiency goals (or ratios) that keep expenses in line and improve profit margins.  These goals and ratios serve as the ultimate yardstick in measuring top-end strategic performance.  There may also be competitive goals that utilize different measures such as market share, product perception, etc.   Companies believe they can achieve these results based on internal and external competitive factors.  It is important to note that the internal and external factors typically drive the timing and define the tactical activities that will be employed to achieve results.

    For example, a change in government regulation may present a significant opportunity for the company that is first to capitalize on the change.  An example of an internal factor may be outstanding customer service that can serve as a market differentiator to attract and retain customers.

    These competitive factors and performance measures drive the definition of the tactical operations (or plan) needed to achieve strategic goals. Tactical operations are ultimately boiled down to human activities and assigned to managers and their employees.  Human activities impact revenue, profit, and quality.  Even quality activities ultimately impact revenue and profit.

    For example, an insurance company may excel at gathering high quality claims data that results in lower claim expenses and legal costs.

    Human activities are incorporated into an individual’s performance plan.  Before defining the human activities though, the goals, competitive factors, and tactical operations need to be gathered into a data repository.  Once gathered, they will be used to gain and communicate corporate alignment.

    Depending on your role in the organization, you may be called upon to help define and capture the financial performance ratios.  You may also be responsible for gathering and storing external factors such as survey results, industry statistics, etc.

    If all goes well, the corporation captures the revenue and performance goals and defines how performance is to be measured.  This is also communicated across the enterprise (gaining alignment).  The performance goals and target financial ratios can be stored in the corporation’s data repository.  The measuring and communicating of progress will be accomplished using a company’s reporting toolset.  The company has to decide the best frequency to communicate actual performance compared to stated goals.  This frequency can be daily, weekly, monthly, or quarterly with the emphasis on providing continual feedback.  Reporting on performance results is the first, and most basic, step in the adoption of BI practices.  Performance reporting answers the question “What happened?” (Davenport & Harris, 2007).  It is very important but only the first step.

    • Davenport, T. H., & Harris, J. G. (2007). Competing on Analytics: The New Science of Winning (p. 8). Boston: Harvard Business School Press.

    The post Business Intelligence Adoption: Goal Setting appeared first on Making Data Meaningful.


    Five Best Practices for Software Maintenance

    In this blog, we cover five best practices for system administrators to keep users satisfied when it comes to maintenance updates: schedule, think holistically, review urgency, test changes...


    July 17, 2018

    Making Data Meaningful

    Forecasting and Predictive Analytics

    Wikipedia defines Forecasting as the process of making statements about events whose actual outcomes (typically) have not yet been observed.

    Examples of forecasting would be predicting weather events, forecasting sales for a particular time period or predicting the outcome of a sporting event before it is played.

    Wikipedia defines Predictive Analytics as an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns.

    Examples of predictive analytics would be determining customer behavior, identifying patients that are at risk for certain medical conditions or identifying fraudulent behavior.

    Based on these definitions, forecasting and predictive analytics seem to be very similar…but are they? Let’s break it down.

    Both forecasting and predictive analytics are concerned with predicting something in the future, something that has not yet happened. However, forecasting is more concerned with predicting future events whereas predictive analytics is concerned with predicting future trends and/or behaviors.

    So, from a business perspective, forecasting would be used to determine how much of a material to buy and keep in stock based on projected sales numbers. Predictive analytics would be used to determine customer behavior: what customers are likely to buy and when, how much they spend when they do buy, and what else they buy when they buy a given product (also known as basket analysis).
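
    To make the basket analysis idea concrete, here is a minimal Python sketch that counts which products are bought together most often. The baskets are made-up data and the function name `pair_counts` is only an illustration; real basket analysis usually goes further, for example by computing support and confidence for association rules.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each unordered pair of products is bought together."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) makes each pair appear once per basket, in a stable order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Made-up baskets, for illustration only.
baskets = [["bread", "milk"], ["bread", "butter", "milk"], ["butter", "jam"]]
top_pair, times = pair_counts(baskets).most_common(1)[0]
```

    Here `top_pair` is the pair of products that appears in the most baskets, which is the kind of signal a targeted promotion could build on.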

    Predictive analytics can be used to drive sales promotions targeting certain customers based on the information we know about their buying behavior. Likewise, the information obtained from predictive analytics can be used to influence sales projections and forecasting models.

    Both predictive analytics and forecasting use data to achieve their purposes. But how they use that data is quite different.

    In forecasting, data is used to look at past performance to determine future outcomes. For instance, how much did we sell last month or how much did we sell last year at this time of year. In predictive analytics, we are looking for new trends, things that are occurring now and in the future that will affect our future business. It is more forward looking and proactive.
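
    As a simple illustration of using past performance to project a future outcome, the sketch below forecasts next month's sales as the average of the most recent months. The sales figures are made up and the moving average is just one of the simplest possible forecasting methods, chosen here as an assumption for the example.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` past observations")
    recent = history[-window:]
    return sum(recent) / window


# Made-up monthly unit sales, oldest first.
monthly_sales = [120, 135, 128, 140, 150, 145]
next_month = moving_average_forecast(monthly_sales)
```

    The forecast looks only backwards at what was sold; it knows nothing about why customers bought, which is the gap predictive analytics tries to fill.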

    So, although forecasting and predictive analytics are similar and closely related to one another, they are two distinctly different concepts. In order to be successful at either one, you have to have the right resources and tools in place to be able to extract, transform and present the data in a timely manner and in a meaningful way.

    A common problem in business today is that people spend much more time preparing and presenting information than they do actually determining what the data is telling them about their business. This is because they don’t have the right resources and tools in place.

    At Making Data Meaningful we have the resources, strategies and tools to help businesses access, manage, transform and present their data in a meaningful way. If you would like to learn more about how we can help your business, visit our website or contact us today.

    The post Forecasting and Predictive Analytics appeared first on Making Data Meaningful.

    Making Data Meaningful

    MicroStrategy: Scalable Yet Agile

    Are you looking for an analytics tool that is simple enough to get up and running fast and has the capability to keep up with your company as its Business Intelligence requirements mature? If so, you will want to check out the new offerings by MicroStrategy.

    Industry Leading Analytics – Enterprise Capable

    MicroStrategy has long been known for its large scale enterprise reporting and analytics solutions.  They have also been the leaders in offering analytics on the mobile platform.

    MicroStrategy’s best-in-class capabilities have traditionally been expensive to purchase and require expert technical assistance to implement and maintain. Large organizations are able to realize economies of scale but small and medium sized companies may find it difficult to justify an initial large investment in software, resources, and infrastructure.

    To overcome an initial software investment, MicroStrategy does offer a free enterprise version of its analytics suite for up to 10 users, called Analytics Suite.  This 10-user license provides the opportunity to try-before-you-buy before rolling out to the larger enterprise.  This product can be hosted on-premise or on MicroStrategy’s cloud.

    Companies still have to develop internal resources to handle security, data architecture, metadata, and reporting requirements.

    Competition from the Edges

    In recent years companies like MicroStrategy, Cognos, SAP, and Oracle have lost ground to smaller, more agile, startups like Tableau and Qlikview. The newer companies have made it faster (and easier) to get up and running with appealing visualizations.

    These smaller companies are now trying to make their products scalable with respect to handling complex security, data architecture, and metadata requirements that are part of all mid- to large-sized implementations.

    MicroStrategy’s Response

    MicroStrategy has responded to the competition by offering two smaller-scale solutions that can be implemented in a matter of weeks: Analytics Desktop and Analytics Express.

    Personal-Sized Analytics

    As its name implies, Analytics Desktop is installed on an individual’s computer.  This product can attach to a variety of relational, columnar, and MapReduce databases.  It can also attach to Excel worksheets and Salesforce data.  Analytics Desktop is designed for individual data discovery, possesses some advanced analytics capabilities, and can publish personal interactive dashboards.  Data sharing is limited to distribution via PDF, spreadsheet export, image files, and distributable Flash files.  Best of all, Analytics Desktop is free and offers free online training.

    Department-Level Solution

    Analytics Express has all of the features of Desktop except that it is hosted completely in MicroStrategy’s cloud environment.  There are no internal MicroStrategy hardware requirements.  A secure VPN connection between MicroStrategy’s cloud and your company’s firewall can be configured to protect your data.  The cloud-based analytics solution can import data from your organization’s back-end databases and refresh the data on a regularly scheduled basis.  Importing the data provides the benefit of much improved analytics performance and data availability.

    It’s a Mobile World

    Additional visualization options are available, plus the ability to deploy solutions tailored for the iPad.  Access to Dropbox and Google Drive is also available.

    Enterprise-Level Security

    Security features include user authentication, single-sign-on, user and user group management, row-level security, dashboard level security, and user-based data filtering.

    Analytics Everywhere

    Dashboards can be embedded in other web pages or on intranet sites.  Visualizations can also be scheduled for email distribution.

    Deploy Before You Buy

    Organizational risk is minimized because MicroStrategy offers a free one-year trial of Analytics Express.  With all architecture hosted in the cloud, your organization won’t have to provide any hardware or technical resources to support this product either.

    Agile and Scalable

    Both the Desktop and Express editions benefited from an improved web-based user interface designed to make the creation of dashboards easier.  MicroStrategy also leveraged its extensive portfolio of Enterprise-Level features by making them available in the hosted solution.  This ensures that MicroStrategy can meet the ever-evolving Business Intelligence and Analytics needs of your organization.

    The post MicroStrategy: Scalable Yet Agile appeared first on Making Data Meaningful.

    Making Data Meaningful

    Organizing Large Projects – How to Avoid “Death by Meeting”

    When I first heard the expression “Death by Meeting”, I thought it was the latest Stephen King novel, but after being the project manager of a project where I was expected to be involved in 20 meetings per week, dying seemed like a welcome alternative.  You can avoid this slow, painful death by creating a project structure that focuses efforts and communications and reduces meetings.

    In addition to the typical project management issues associated with the multitude of tasks required for large projects, there is a significant challenge in creating an efficient, effective project structure that drives the project effort to the correct worker-bee level and enables good project status communications, but streamlines the number of meetings required to achieve these goals.  One approach that has worked for me is the use of Project Workgroups.

    Most large projects consist of numerous tasks that can usually be grouped together in some manner.  These groupings may be by departmental function (Finance, IT, Purchasing, etc.), by activity (sales, development, implementation, training, etc.), by deliverable (software release, management reporting, etc.), or perhaps some other logical division.  Regardless of the grouping, there will be common goals and activities that will enable creation of workgroups reflecting these goals.

    Once you have determined some logical workgroups, the next step is to define a project team structure.  At the top of the structure is the Steering Committee.  This is the group that is made up of senior management who are the key stakeholders for the project.  The role of this group is to provide high-level direction, provide resources (monetary and personnel), and resolve major roadblocks to the success of the project.  Steering Committees may oversee multiple concurrent projects, and will meet on a monthly or quarterly basis.

    At this level, the Steering Committee members want to know where the project stands in terms of schedule, budget, and final deliverables.  A fantastic tool for providing the Steering Committee this information is a project dashboard.  This dashboard should consist of a few key measurements with a status for each, using easy-to-read indicators like traffic lights or gauges.  Here is an example:

    This dashboard eliminates the need for developing voluminous detailed reports, and provides for exception level discussions.  Only items that are yellow or red require explanation, so meetings are focused and their lengths are minimized.
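
    The exception-based approach can be sketched in Python. This is a hypothetical illustration, not a tool from the project: the measure names, the (actual, target) values, and the 90% warning threshold are all assumptions made for the example.

```python
def status(actual, target, warn_ratio=0.9):
    """Map a measurement to a traffic-light colour (higher is better)."""
    if actual >= target:
        return "green"
    if actual >= warn_ratio * target:
        return "yellow"
    return "red"


def exceptions(measures):
    """Return only the measures that need discussion (yellow or red)."""
    flagged = {}
    for name, (actual, target) in measures.items():
        colour = status(actual, target)
        if colour != "green":
            flagged[name] = colour
    return flagged


# Made-up measurements as (actual, target) pairs, e.g. percent of plan achieved.
measures = {"schedule": (95, 100), "budget": (100, 100), "deliverables": (60, 100)}
agenda = exceptions(measures)
```

    Only the flagged items end up on the meeting agenda; anything green is left off entirely.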

    The next level down from the Steering Committee is the Project Management Team, sometimes referred to as the Project Core Team.  This team consists of key middle-management personnel representative of the primary functional areas affected by the project.  The Core team should meet weekly or bi-weekly and is responsible for the direct management of the project activities.  The RAID (Risks, Action Items, Issues, Decisions) document I referenced in my previous blog is the perfect communications tool for the Core Team.  It provides a clear, concise mechanism for letting the team members see the critical items that require their attention.

    The next level of the project organization below the Project Core Team contains the working groups for the project.  The makeup of the workgroups will vary by project; however, this is the level where the daily tasks of the project are managed.  This is the level that can bring you closest to a near-death experience since the number of teams and meetings is highest here.

    Analyze your project and its deliverables to determine the best method for defining the workgroups.  An excellent place to start is with the desired deliverables since it is difficult to split a single deliverable across workgroups.  Another factor to consider is inter-departmental dependencies.  Departments that closely interact with each other and/or are dependent upon each other can be combined on a workgroup to leverage that interdependency.

    Meetings at this level of the project team need to be at least weekly.  As above, the RAID document can be used to focus and track activities of the group, and facilitate communications to the project manager and the Project Core Team.  If the tracking and reporting mechanism is standardized, then the project manager does not have to participate in all of these meetings.  Focus the workgroups on the RAID documents and they will drive the agendas and reports so that meeting death takes a holiday!

    In summary, to avoid the prospect of having the next project you manage being the planning of your own funeral after a painful “death by meeting” experience, try using the techniques described in this article.  By constructing a project team structure as described, you can keep all the affected parties updated, involved, and focused in a manner that streamlines communications, maximizes resources, and minimizes wasteful meetings.  The use of standardized task tracking and reporting tools will enable you as project manager to have visibility of all the project workgroups’ activities, and provide you the tools necessary to drive the project home successfully.

    The post Organizing Large Projects – How to Avoid “Death by Meeting” appeared first on Making Data Meaningful.

    Making Data Meaningful

    Internet of People: An Analysis

    I recently read an article by Strategy& on the Internet of Things (IoT) entitled “A Strategist’s Guide to the Internet of Things”.  The article begins with the current state of the electronic world.  There will be 50 billion Internet-responsive devices by 2020, and only a third will be smart devices. The rest will be…well, hard to put into a single word.  They will range from smart appliances to RFID (Radio-Frequency Identification) chips and will populate nearly everything, so long as there is meaningful information worth mining.  According to the article, there are three broad strategic categories within the IoT: enablers, enhancers, and engagers.  Enablers generate and instate underlying technology; engagers deliver the IoT to customers through a streamlined one-stop shop; enhancers concoct value-added services for engagers. These categories try to bring structure to a very complicated problem. Yet structure implies a foundation, and in reading this article I came upon a clichéd but logical question: “Is there an Internet of People (IoP)?”  After a brief skim, my question was, of course, answered.  Yes, the concept was in use, but in a different manner than I had anticipated.  The author who got there first saw the IoP as a shift in the way government and economic models operate under the IoT, and he approaches the question philosophically: “The social and economic contracts between people, businesses and governments are undergoing a fundamental change and new rules of the game and governance models are needed for the future digital societies”.

    The same thought that had piqued my interest had led another to the same conclusion: the IoP is the foundation of the IoT.  Sensors and data collectors respond to products, and products respond to people.  An example can be seen in current political revolutions: in 2010, Hong Kongers displayed their disfavor with anti-democratic sentiment from Mainland China through texting, Twitter, and Facebook (The Economist: Protest in Hong Kong). When China made such communication impossible in recent protests by disconnecting communication infrastructure, protesters turned to FireChat, a texting app that operates at short range without Wi-Fi.  The irony of this situation is simple: humans used democracy to support democracy through technology.  Most recently, an article about Iran in The Economist, in the most prescient sense, assessed Iran by saying: “The revolution is over.” Yet, in many ways, the revolution in Iran has just begun.  As the article reports, “So-called VPNtrepreneurs sell software and access codes to bypass controls. E.G. a 21-year-old, who resells software, says he charges a dollar a month or $10 a year to his 80,000 clients and he uses his day job at an IT company as a cover. And, occasionally he pays the cyber-police a few hundred dollars in bribes.” No longer can Ruhollah Khomeini rule as absolute dictator.  Power has been decentralized.  This lovely article was an honorable Facebook post of mine. The point is that people are people and each person is different.  This is the irony of the IoT: for once, mass production has turned to personalization.

    According to The Economist, Amazon is sensational because it sells over 230M products that are each accessible within seconds. Even the concept of Amazon appears decentralized.  It doesn’t matter where you are; your order will be there quickly and, in most cases, relatively cheaply.  It is important to recognize that this decentralization has been underway for a long while.  The past two decades have proved that people can work from home without losing connection.  This is amazing.  I remember watching Austin Powers when I was younger and gaping at the videoconference scene between Dr. Evil and world leaders.  I thought, “What if that was real?” In 2013, FaceTime mass-produced this reality as Voice over IP (VoIP).  Soon Facebook, Gmail, and other messaging services also made this free, even though Skype had been doing this for a decade.  I took advantage of these developments while I studied in London last year, and in many ways I felt much closer to home than the “pond” used to suggest. So close, in fact, that I didn’t come home once during that period.  Connectivity, essentially, has changed the way that we communicate.  Price point no longer inhibits the amount of time we spend on long-distance calls.  This increase in connectivity options has literally come from an increase in connectivity. The Economist recently posed an honest question on hotspots: “As Wi-Fi proliferates, who needs cellular wireless?” By 2018 the number of public hotspots is expected to increase from 47M to 340M.  The point is that technology has changed human behavior.  It should be no surprise that communication companies make up the largest companies.  Technocracy, a term coined by the Californian engineer William Henry Smyth in 1919, rules the world.
    Jack Ma, Mukesh Ambani, Elon Musk, Steve Jobs, Bill Gates, Larry Ellison – these are the famous men of today (sorry, Jamie Dimon, but you’re a rare example), and they are famous because they are accentuating our most important human activity: communication.

    This is why the IoP, through the foundation of techno-communication, is being backed into by the IoT.  As we push against products and they push against us, an Internet of People is going to include the customization not only of our products but of ourselves.

    The post Internet of People: An Analysis appeared first on Making Data Meaningful.

    Making Data Meaningful

    Blobitty Blob Blob Blob….

    At my current client, we are working on a major rewrite of their Claims Management System, which has included moving the old data from an Oracle database to a new SQL database being designed by our team. Part of this conversion involved a particular challenge. We were tasked with extracting all of the documents stored in the Oracle DB as BLOBs (HUGEBLOBs to be exact) and loading them out to the file system. The documents are going to be stored in the new SQL database in a “File Table”. A File Table is essentially an index to these files: the files themselves are not physically stored in the database but out in the file system. The File Table contains certain fields, including a unique stream_id, a path_locator that indicates where the document is, and other pertinent information about it, including the file_stream encoding, file name, file type, file size and other attributes. There is no ability to add additional fields to the built-in SQL File Table structure for storing other key information.

    There are 1.2 million documents that have to be extracted, saved to the file system, and linked to the appropriate claimant’s record in the new database. Challenge #1 is linking each document to the appropriate claimant’s record when no other fields can be attached to the documents as they are extracted to the file system. Challenge #2 is getting the documents out en masse (certainly we couldn’t save them out one by one using the existing front end application). Challenge #3 comes later, and I’ll get to that.

    So I did some searching and found a built-in Oracle package called UTL_FILE that can write the documents from the Oracle BLOB table out to files. I built a script around it that uses a cursor to select the documents a chunk at a time, then loops over each document, renames it, and runs UTL_FILE to save it out, splitting each file into further chunks when it is too big for one write (the limit is 32k bytes at a time). The script writes the files out to the file system on the network. This solved Challenge #2, which was to bulk extract all of these documents, although due to the time cost of the script, the extraction will still need to be done in smaller batches.
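
    The chunked write is the part that is easy to get wrong, so here is a minimal sketch of the idea. It is written in Python rather than PL/SQL and is not the script used on the project; the 32767-byte chunk size is my reading of the "32k" limit mentioned above, and the function name `write_in_chunks` is hypothetical.

```python
import io

MAX_CHUNK = 32767  # bytes per write; assumed value for the roughly 32k limit


def write_in_chunks(blob: bytes, out, chunk_size: int = MAX_CHUNK) -> int:
    """Write `blob` to the binary file object `out` in pieces; return the number of writes."""
    writes = 0
    for start in range(0, len(blob), chunk_size):
        out.write(blob[start:start + chunk_size])
        writes += 1
    return writes


# In-memory stand-in for a file on the network share.
buffer = io.BytesIO()
document = b"x" * 70000
n_writes = write_in_chunks(document, buffer)
```

    The last slice is simply shorter than the others, so no special handling is needed for documents whose size is not a multiple of the chunk size.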

    Renaming the documents as they came out of the system solved Challenge #1, where these documents needed to somehow be linked to the original claimant’s record. I appended the original Claimant ID number to the front of the document name, separated by “—“ so it would be easy to use SUBSTRING to get the ID number out later for loading into the table that links the File Table records (by the stream_id) to the Claimant records.
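    The naming scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the claimant ID is assumed to be numeric, and the "--" separator is a stand-in for whichever separator the export actually used.

```python
from typing import Tuple

# Hypothetical stand-in for the separator placed between the ID and the file name.
SEPARATOR = "--"


def build_export_name(claimant_id: int, original_name: str, sep: str = SEPARATOR) -> str:
    """Prefix the original document name with the claimant ID."""
    return f"{claimant_id}{sep}{original_name}"


def split_export_name(export_name: str, sep: str = SEPARATOR) -> Tuple[int, str]:
    """Recover (claimant_id, original_name) from an exported file name.

    Only the first separator is treated as the boundary, so a document name
    that itself contains the separator survives the round trip.
    """
    id_part, found, original_name = export_name.partition(sep)
    if not found or not id_part.isdigit():
        raise ValueError(f"no claimant ID prefix in {export_name!r}")
    return int(id_part), original_name
```

    Splitting on the first separator only is the design choice that matters here: the ID is always at the front, so whatever follows can be passed through untouched.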

    Challenge #3 came when trying to open some of the documents after they were exported. Approximately 10% of them were corrupted. A lot of head scratching and looking for similar patterns in these documents yielded no clear answer as to why. A suggestion from my manager led me to explore the code from the old application. I looked at the code that uploaded documents into the Oracle table, and the code that opened them and allowed the users to export and save them out to the file system. Therein lay the answer. It appeared that some of the documents were being compressed before being uploaded to the Oracle database. There seemed to be no business rule for which were to be compressed, and there was no indicator in the database as to whether they were compressed or not. Therefore, I altered my UTL_FILE script to uncompress all of the files before saving them out. Unfortunately, if a file was NOT compressed, it would throw an error. So, I again altered my script to catch the error, and NOT uncompress those documents. Voila, the script worked like a charm. Here it is in all its glory, and our customer is happy that we can get all their BLOBs out!
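    The author's PL/SQL script is not reproduced in this feed. As a rough illustration of the "try to uncompress, and fall back to the raw bytes on error" idea only, here is a minimal Python sketch. It assumes gzip compression, which the post does not confirm, and the name maybe_decompress is hypothetical.

```python
import gzip
import zlib
from typing import Tuple


def maybe_decompress(blob: bytes) -> Tuple[bytes, bool]:
    """Return (content, was_compressed) for a stored document.

    Assumes gzip was the compression used. With no flag in the database to
    say which documents were compressed, decompression is simply attempted
    and any failure is taken to mean the document was stored as-is.
    """
    if not blob:
        return blob, False
    try:
        return gzip.decompress(blob), True
    except (OSError, EOFError, zlib.error):
        # Not valid gzip data: keep the original bytes untouched.
        return blob, False
```

    One caveat of this approach is that it relies on the decompressor rejecting uncompressed input, so a damaged compressed document would be saved in its raw form rather than reported as an error.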

    The post Blobitty Blob Blob Blob…. appeared first on Making Data Meaningful.

    Revolution Analytics

    Video: R for AI, and the Not Hotdog workshop

    Earlier this year at the conference, I gave a short presentation, "The Case for R, for AI developers". I also presented an interactive workshop, using R and the Microsoft Cognitive Services...

    Ronald van Loon

    Digital Meets 5G; Shaping the CxO Agenda

    The age of technology is way past its nascent stage and has grown exponentially during the last decade. In my role, travelling to events and meeting with thought leaders, I am aware of the developments made across different technological fronts. During these events I have had the opportunity to meet and greet some of the brightest marketing minds. One such person, who is playing a vital role in keeping things running smoothly both at Ericsson and across the digital sphere, is Eva Hedfors.

    Eva Hedfors, who is the Head of Marketing and Communications at Ericsson Digital Services, is a leading driver in evolving the perception of Ericsson as a partner in operators’ transformation from Communication Service Providers to Digital Service Providers. I first met Eva during an Ericsson studio tour in Kista, Sweden. During our first meeting I could tell how knowledgeable she was and what a positive vision she had. Just recently I had the opportunity of interviewing her for a topic she will be presenting in a webinar on the 20th of June. The topic of the webinar – Digital Meets 5G; Shaping the CXO Agenda – is up for interpretation, and she did give me details regarding what she is expecting from the webinar, and how she plans to go about answering some of the questions in this regard.

    What Steps should be taken by CxO’s for a Smooth Transition to 5G?

    Eva shared her insights on how CxOs could prepare for a smooth transition to Digital Service Providers powered by 5G. “The initial 5G technology deployment will target the massive growth of video traffic in networks, but a leading and hard-to-crack requirement for all CxOs is also to realize the industry potential and find new business growth through 5G. This involves both innovating and participating in ecosystems, as well as optimizing the route for marketing such 5G services. CxOs can take advantage of 5G to address relatively new segments and industry use cases in mission critical IoT as well as Massive IoT.” Eva explained that the business models one creates also need to be up to date and should reflect what’s happening in the market. Since the plan for 5G is rather new, most companies and industries won’t know much about it. Hence, decision makers in Telcos need to position their existing capabilities towards different industries, and using Network Slicing is one way to do that already on 4G. To capture the potential of 5G, many CxOs will need to focus on revamping their strategy for billing and charging systems into a Digital Business Support System (BSS). Moreover, a proper infrastructure needs to be provided to ensure that the end consumer gets to experience the technology in a seamless manner. This would help generate positive insights. 5G is here today, and action needs to start right now!

    How to Avoid the Challenges Involved in Digitization?

    The first step to avoiding the challenges involved in digitization is to recognize the effort most customers have to make when engaging with their Telco provider. Once this effort has been quantified, Telcos can take the necessary action. Touch points should be made accessible, and there should be no hindrance in communication for B2C, B2B and B2B2C customers. Failure to put the right digital IT infrastructure in place, including analytics and Digital BSS, will limit the business potential of 5G. That is why 5G and Digitalization need to be planned and executed not as individual technology transformation projects, but as one transformation that aligns towards the same overall business objective in each time frame. Moreover, the technology teams should be motivated to simplify the core network and make it programmable. Eva mentioned that it was imperative for organizations to start now and simplify the journey from vEPC to 5G Core for proper implementation and monetization of these revamped services.

    Research for 5G Readiness across Different Industries

    When asked about the research done to analyze the 5G readiness across different industries, Eva mentioned that Ericsson has done several reports on the potential of 5G across industries.

    1. The 5G business potential study by Ericsson analyzes the business opportunities that come from proper industrial digitalization. The report focuses on the opportunities for organizations in 10 major industries: Manufacturing, Energy and Utilities, Automotive, Public Safety, Media and Entertainment, Healthcare, Financial Services, Public Transport, Agriculture and Retail. The research includes detailed use cases for these industries, which may help stakeholders make a decision regarding 5G usage.
    2. Another research-based study released by Ericsson in this regard is the guide to capturing 5G-IoT Business Potential. The study answers questions pertaining to the selection of industries and what use cases to address. The insights have been collected from over 200 5G use cases that illustrate how operators can start their IoT business now through the use of 5G.

    How Can 5G Technology Improve the Customer Experience Offered to existing Customers by Service Providers?

    Enhanced Mobile Broadband is one of the major benefits of 5G technology, according to Eva, and it will help service providers enhance the experience they offer to their customers, who consume ever more video on mobile devices. Better performance, reliability and ultra-high speed are some examples of the broader consumer experience that 5G can provide. According to a recent ConsumerLab report conducted by Ericsson, more than 70 percent of all consumers identify performance as one of their major expectations of 5G.


    What are the Preparations and Biggest Challenges for 5G Readiness?

    Through our industry partnerships, we know that many organizations across many industries have started to analyze how 5G will help drive their digital transformation. 5G business models are being crafted to ensure that the implementation is as smooth as possible. The biggest challenge to capturing the market potential for all actors in the industrial ecosystems, including telecom operators, is the investment in technology and business development. Business development will fall along the lines of organizational adaptation, and Eva believes that a proper infrastructure needs to be provided for industry-wide implementation of 5G. Only organizations that have created the right structure and the model required for 5G implementation are ready for the technology. Without organization-wide infrastructure, 5G would be just like a car running without roads and filling stations.

    Integrating 5G Technology across Infrastructure

    As discussed above, decision makers need to realize the importance of a proper automated structure that spans all touch points to ensure that there is no hindrance to the adoption of 5G services. To that end, organizations also need to realize the importance of an architecture evolution strategy. The evolution strategy should seamlessly integrate 5G across the infrastructure and ensure full flexibility in the handling of billing, charging and customer interaction.

    Both IoT and 5G technologies are shaping the digital transformation and transforming all digital architecture by helping organizations evolve their services and infrastructure. 5G particularly brings a new level of characteristics and performance to the mix, which will play an important role in the digitalization of numerous industries. Telecom operators leveraging the power of 5G technologies can gain from financial benefits as well, as a USD 619 billion revenue opportunity has been predicted for these operators in the future. This revenue opportunity is real and up for grabs by operators, but it does require business model development that elevates telecom operators beyond the connectivity play.

    For further insights in this regard, and on what CxOs need to do to properly facilitate Digitalization and 5G technology, you can head over to the webinar being hosted by Eva Hedfors and Irwin van Rijssen, Head of 5G Core Program at Ericsson, on the 20th of June.


    Ronald helps data driven companies generating business value with best of breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining and is guest author on leading Big Data sites, is speaker/chairman/panel member on national and international webinars and events and runs a successful series of webinar on Big Data and on Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a Data Consultancy company, leader in Big Data & data process management solutions. Broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.

    The post Digital Meets 5G; Shaping the CxO Agenda appeared first on Ronald van Loons.

    Ronald van Loon

    How CDOs View Data Ethics: Corporate Conscience or More Regulations

    We really are in the midst of a data revolution. With a huge amount of data being generated every day, organizations in the current world are encircled left, right, and center by data and the analytic tools that are required for handling it. Leveraging this data has given companies unprecedented insights into different customer preferences and how they can cater to these needs. So, with all the emphasis on data and the capabilities it holds in the current world, should there be a question regarding data ethics?

    Recently, I got a chance to attend Sapphire by SAP. At the occasion, I was asked to moderate the International Society of Chief Data Officers event along with Franz Faerber, Executive Vice President at SAP, and Michael Servaes, Executive Director, International Society of Chief Data Officers. The event was graced by some very knowledgeable attendees, who shed light on the importance of data ethics, and what the way forward is.

    Speaking at the event, I talked about the different uses of data in the world today, and how they shape our present and our future. Most companies today are no longer competing with their competitors; they are up against the bar that has been set by their customers and the expectations that these customers now have of them. This bar has been set by the excellent service provided by companies such as Google and Facebook. Every touch point is important, and organizations need to realize this. Data analytics is the way forward. With all these advancements, a question arises: is the use of data going forward ethical? The role of the Chief Data Officer (CDO) here is to deliver a great customer experience by managing what they do with their data.

    How to define what’s Ethical?

    When it comes to data ethics, the first question that arises is about what’s ethical and what’s not. There seems to be confusion in this regard, and most organizations cannot reach a consensus on defining this. The panel that I witnessed at the CDO event had well defined answers to this question.

    The first and foremost step in judging whether any use of data is ethical is to run the “sunshine rule.” This rule asks how you would feel about your organization if the way you used your data was out in the open, under the sun for everyone to see. If you feel that you wouldn’t mind your use of data becoming public, then you probably have no reservations about its ethics. But if you feel that you wouldn’t be comfortable with your use of data being out in the open, then you might not be using your data in the most ethical manner. This is a litmus test that only works with honest answers. Give the answer based on what you truly feel, and you’ll be able to tell whether your use of data is justified or not.

    Similarly, there are other litmus tests for the ethical aspect of using data. You can imagine a scenario where your use of data is reported in the newspaper: would you feel comfortable or threatened? The panel members also got on the lighter side of things and mentioned imagining what you would feel if your significant other found out. Would you be able to stand as the same person, or would the guilt of the unethical use of data kill you? Having answers to all these questions can help you define your use of data as ethical or unethical.

    The role of the Chief Data Officer

    Chief Data Officers or CDOs are the leaders of the data functions in their organization. Beyond this, they play an important role in helping the organization abide by the laws that regulatory authorities have set in this regard. The General Data Protection Regulation (GDPR) is one regulation that CDOs in Europe, and also in the rest of the world if they do business with European citizens, need to be well versed in. All panel members in the discussion agreed that the Chief Data Officer should keep their team in the loop at all times. It is always good to have the opinion of your team, rather than to go along with just your own opinion. If even one person in the team feels that the data is not being used in ethical ways, the CDO should be able to take the required steps to address this issue.

    Until very recently, CDOs were considered the new kids on the block, and many other C-level executives didn’t rank them at the same level as themselves. However, these rising stars of the digital business have now taken center stage in deciding ethical standards in any organization. One of the most important parts of a CDO’s job is to make fine judgment calls that don’t cross the line between trust and innovation. It is important for CDOs to recognize the trust organizations place in them and to work to ensure that this trust is respected.

    CDOs sit at an important position, and it is their job to understand the ethical requirements of using data and to fulfill those requirements. In the midst of advanced data analysis tools, it is important for CDOs to also realize the importance of giving customers the very best in terms of ethical standards; customers deserve this trust.

    Impact from Algorithms

    While the world of data can be considered as impressive and transformative, there have been instances where algorithms have gone wrong. These instances have happened because of a wide variety of reasons including human biases, usage errors, technical flaws, and security vulnerabilities. For instance:

    • Many social media algorithms have drawn the wrath of viewers over how they influence public opinion. Just recently, we saw Google wrongfully accusing the views of the shooter behind the Los Angeles shooting. They later took the blame upon themselves, but these algorithms can dictate public opinion.
    • Back in 2016, around the Brexit referendum, algorithms were also blamed for the flash crash that saw the pound fall by over six percent in value.
    • Moreover, investigations in the United States have also found out that algorithms in place within criminal justice systems have been biased against a certain racial group.

    Best Practices

    To effectively manage the ethical implications of data, CDOs should take the reins and adopt new and better approaches for building stronger foundations. There should be better algorithm management, and CDOs using data analysis tools should ensure that they take care of the ethical needs of the data they have with them. Only then would they be able to really use that data for something feasible.

    We used multiple voting at the event, and came to the conclusion that data ethics should be declared as a guideline on the corporate level. More than 90 percent of the CDOs in the audience thought that way, and with growing regulations in this regard, more ethical checks on data are more of a necessity than a want.

    About the Author

    Ronald van Loon is Director at Adversitement, and an Advisory Board Member and Big Data & Analytics course advisor for Simplilearn.

    If you would like to read more from Ronald van Loon on the possibilities of Big Data and the Internet of Things (IoT), please click “Follow” and connect on LinkedIn, Twitter and YouTube.


    The post How CDOs View Data Ethics: Corporate Conscience or More Regulations appeared first on Ronald van Loons.


    July 13, 2018

    Revolution Analytics

    Because it's Friday: Language and Thought

    Does the language we speak change the way we think? This TED talk by Lera Boroditsky looks at how language structures like gendered nouns, or the way directions are described, might shape the way...


    July 12, 2018

    Revolution Analytics

    New open data sets from Microsoft Research

    Microsoft has released a number of data sets produced by Microsoft Research and made them available for download at Microsoft Research Open Data. The Datasets in Microsoft Research Open Data are...


    Simplified Analytics

    How to address the reluctance for Digital Transformation?

    Digital Transformation is in full swing now and adopted by almost all the industries to improve the customer experience. But not everyone is sailing smooth. In fact, a majority of Digital...


    July 10, 2018

    Revolution Analytics

    In case you missed it: June 2018 roundup

    In case you missed them, here are some articles from June of particular interest to R users. An animated visualization of global migration, created in R by Guy Abel. My take on the question, Should...

    Ronald van Loon

    Strategies for Monetizing Data: 2018 and Beyond

    The data revolution is here, and it creates an investment priority for enterprises to stay competitive and drive new opportunities. One of the brightest areas is data monetization, which describes how to create economic benefits, either additional revenue streams or savings, utilizing insights provided by data resources. With B2B and B2C data needs reaching an all-time high, the monetization strategies now and into the future should be seamless for use across multiple platforms.

    To get an expert view on this matter, I recently tapped Jeremy Rader, Director of Data Centric Solutions at Intel.

    The Opportunity for Data Monetization

    Researchers have reported that the market size for big data is on the rise and is fast becoming an important distinction for organizations. This age of data means that the data culture for every organization needs to be revamped. Almost any company now has the potential to be a data company. In a research study conducted recently on big data and analytics, more than 85 percent of all respondents interviewed reported that their organizations had taken steps toward a data-driven culture. But, when asked if they had success in achieving that culture, only 37 percent replied in the affirmative.

    Positioning Your Organization for Success

    A key protagonist in this move toward a data culture is the Chief Data Officer (CDO), responsible for leading the figures behind data within an organization. But not every organization has a CDO, and for those organizations that do, it’s a new role with an evolving definition.

    The key role of the CDO should be to take a futuristic view of an organization’s data model that includes a data monetization strategy. This Eckerson Report includes internal recommendations for data monetization, including delivering concrete data analytics to your employees so they can prioritize, make more informed decisions and reduce costs. There are also opportunities to enrich your existing products with the use of data analytics and customer retention models, and to create a whole new product line that generates revenue by selling your data products to customers.

    What is needed for Data Monetization Success?

    To start, companies must be able to glean timely, in-depth insights from their data. Those insights come from the ability to access, organize and interpret the data—in effect, taking a ‘whole business’ approach to analytics.

    A key focus area to help enterprises begin to align and organize around their data strategy is to get their data layer right. AI and advanced analytics workloads require massive volumes and types of datasets. To get your data ready to harness, break away from fragmented systems and older data storage models that keep your data trapped. Many organizations achieve this by implementing a modern data lake model. Then, tier your data based on its use. Your tiering strategy should include a storage model that matches your data tiers to reduce storage costs and optimize performance.

    Here are some other tactics organizations should consider:

    • Establish a Clear Vision: The company’s executives should share the vision of correctly monetizing data by allocating necessary resources, including time, workforce and investment toward execution.
    • Agile Multi-Disciplinary Teams: Data monetization can be done through agile multi-disciplinary teams of data architects, product managers, application developers, analytics specialists, and marketing and sales professionals.
    • Develop a healthy, competitive, data-driven culture: Unless communicated across an organization, data remains worthless. To extract the right information and insights from structured and unstructured data, it is important to focus your efforts on cultivating a data-driven culture that empowers employees with the resources and skills they need to leverage data and obtain the right information at the right time to make more accurate decisions.
    • Ensure Easy and Secure Access to Data: For data to be monetized, it not only needs to be voluminous in size and nature, but also clean, accessible and consistent.
    • Data management & advanced analytics: A digital data management platform is essential for integration and providing solutions which are elaborate and comprehensive. A proper enterprise data management platform should contain the five service layers: engagement, integration, development, data, and modern core IT, which are the key components of every digital business. Advanced analytics provides the eventual meaning to the data through summarizations, models, calculations and categorizations. Data is valuable once it is analyzed.
    • Storage: Increased storage efficiency is critical to ensure your data is available and can be analyzed. The faster the data can be accessed while processing, the shorter the time to results and the more detailed and nuanced the analysis that fits within a given response time. Intel® Optane™ DC persistent memory is a new class of memory and storage technology that better optimizes workloads by moving and maintaining larger amounts of data closer to the processor and minimizing the higher latency of fetching data from system storage.
    • Processes & delivery: A continuous development process that customizes data and analytics to your target audience needs a delivery system that provides analytics up to an advanced end-user application.

    The Future of Data Strategies for Organizations Dealing With Large Data Volumes

    As an example of a successful data strategy in action, the business of healthcare has an abundance of data and opportunities that can help power more accurate diagnosis and improved patient care. The stakes are high in an industry where patient outcomes are impacted by quick, early detection and treatment.

    For example, Intel worked with a large health system that had an older data infrastructure with fragmented systems and data silos, which was impeding its ability to rapidly access, blend, and analyze data from multiple sources to deliver precision medicine, improve patient monitoring, and drive innovation in its healthcare practices. By deploying a modern data hub (Cloudera* Enterprise) running on Intel® Xeon® processors, this large health system was able to see significant results. It is using machine learning algorithms and predictive analytics at scale to anticipate and account for various patient outcomes by analyzing over 100 data points per patient per day for hundreds of thousands of patients.

    There will be obstacles along the journey to get your data to a place where it can be used to answer some of your biggest challenges, but those challenges can be overcome with the right focus and investment.

    The AI Revolution is Backed by Data

    Intel understands that the advanced analytics and AI revolution is backed by and powered by data. The data has to be constantly maintained to achieve the ultimate potential of advanced analytics and AI. As such, Intel is focused on leading the charge for open data exchanges and initiatives, easy-to-use tools, training to broaden the talent pool, and expanded access to intelligent technology.

    The data revolution will drive demand for advanced analytics and AI workloads, requiring optimized performance across compute, storage, networking and more. The recent advancements by Intel, as they usher in this paradigm shift, include the Intel® AVX-512, a workload accelerator, and the Intel® Xeon® Scalable processor based platform. Through optimized infrastructure, modern storage and data architecture, and a pathway to run complex and massively scalable analytic workloads in any environment as well as scale up and scale out with performance and agility, we can successfully enable the business of data from the edge to the cloud to the enterprise.

    For more information, visit

    About the Authors:


    Jeremy Rader, Director of Data Centric Solutions at Intel, is responsible for enabling business transformation by driving Analytics, AI and HPC solutions, while driving next generation silicon requirements. LinkedIn and Twitter.



    Ronald van Loon, Director at Adversitement, a data & analytics consultancy firm, and Advisory Board Member & course advisor for leading professional certification training company Simplilearn.

    If you would like to read more from Ronald van Loon on the possibilities of Big Data and the Internet of Things (IoT), please click “Follow” and connect on YouTube, LinkedIn, and Twitter.


    Ronald helps data driven companies generating business value with best of breed solutions and a hands-on approach. He has been recognized as one of the top 10 global influencers by DataConomy for predictive analytics, and by Klout for Data Science, Big Data, Business Intelligence and Data Mining and is guest author on leading Big Data sites, is speaker/chairman/panel member on national and international webinars and events and runs a successful series of webinar on Big Data and on Digital Transformation. He has been active in the data (process) management domain for more than 18 years, has founded multiple companies and is now director at a Data Consultancy company, leader in Big Data & data process management solutions. Broad interest in big data, data science, predictive analytics, business intelligence, customer experience and data mining. Feel free to connect on Twitter or LinkedIn to stay up to date on success stories.

    The post Strategies for Monetizing Data: 2018 and Beyond appeared first on Ronald van Loons.


    July 09, 2018

    Revolution Analytics

    R 3.5.1 update now available

    Last week the R Core Team released the latest update to the R statistical data analysis environment, R version 3.5.1. This update (codenamed "Feather Spray" — a Peanuts reference) makes no...


    July 06, 2018

    Cloud Avenue Hadoop Tips

    What is DIGITAL?

    Very often we hear the word DIGITAL in the quarterly results of IT companies, especially in India, where revenue from the DIGITAL business is compared with revenue from the traditional business. So, what is DIGITAL? There is no formal definition; the term has been used loosely by different companies.

    Here, however, is a definition of DIGITAL given by Rostow Ravanan, Mindtree CEO and MD, in an interview at MoneyControl. It is a bit vague, but it is the best I have found so far. The vagueness comes from the fact that it doesn't say what BETTER means. Does anyone see something missing? I see IoT missing. Lately I have been working on IoT and will be writing up my opinion on where IoT stands as of now.

    Q: Digital is still a vague term in the minds of many. What does it mean for you?

    A: So let me go back a little bit and tell you what we define as digital. We define digital and we put that in our factsheet, whenever we declare results every quarter.

    In our definition of digital, we take one or two ways of defining it. To a business user, we define digital from a business process perspective to say anything that allows my customer to connect to their customer better or anything that allows my customer to connect to their people better is one way of defining digital from a business process point of view.

    Or if you were to look at digital from a technology definition point of view, we say it is social, mobility, analytics, cloud, and e-commerce. From a technology point of view, that is how we define digital.

    July 03, 2018

    Ronald van Loon

    The Intelligent World: 5 Cases from Today

    With all the talk about the smart revolution, we are finally living in a smart and intelligent world, with numerous use cases around us. While these use cases point towards greater use of smart resources to make the world around us intelligent, several trends have also emerged in the recent past. Three of these trends are:

    1. Trend 1: The digital world is finally having a bigger presence in the physical world, and we are moving towards the intelligent world.
    2. Trend 2: Digital transformation seems to be the only way towards success in the industry. It is currently the need of the hour to make correct decisions for the existing industries, and those in the future.
    3. Trend 3: The ICT infrastructure is the cornerstone for all digital platforms, as it fully supports service platforms and enables the digital transformation within any industry.

    With these trends in mind, one can gear up for an interesting future as the smart world approaches. Given my interest in this area, I was recently invited by Huawei to the Huawei Booth at CEBIT as part of their Key Opinion Leader program. For those who don’t know, Huawei is a major player in the move towards a better, more intelligent world. They have consistently invested 10 percent of each year's revenue in innovation and research and development. Their R&D ranks sixth among all companies worldwide, and they have a large part to play in the intelligent world.

    As part of the Huawei booth at CEBIT, I got to see many of the latest innovations that inspired me. Here I would point out five innovations and cases, which I experienced at CEBIT. These innovations seem to be a major part of the intelligent world, and should be thoroughly studied, to form a greater understanding of the digital world.

    Smart Cities

    The concept of Smart Cities has become a big part of the move towards the digital world. Smart cities were conceptualized almost half a decade ago, and we have come a long way since then. In the many use cases being presented, one can see Infrastructure as a Service and Software as a Service (IaaS and SaaS) offerings being used in smart cities.

    With the rapid development of cities across the world, it was just a matter of time before a smarter method of city management rolled out. Urbanization has been found to exert additional pressure on city management, governance, and industrial development. Under this pressure, economic protection, public safety, industrial and commercial subsystems, and people’s livelihoods end up fragmented and uncoordinated. A smart city program redefines city management and uses Machine Learning (ML) and Internet of Things (IoT) technologies to augment humans in this integral part of modern urban management. The use of analytics techniques helps integrate the core information about city operations.

    The smart city leads to effective and collaborative government services, intelligent infrastructure, coordinated city emergency responses, and visualized city operations. Through this interactive convergence, traditional siloed city management transforms into more coordinated and integrated city governance. In China alone, more and more cities are becoming part of this venture as city administrators realize the benefits of smart city management. Smart city management has helped increase law-enforcement efficiency by 150 percent in the few use cases that we have seen. The system gathers information from maps, videos, and IoT to perform visualized multi-agency operations. However, administrators need to understand that smart cities depend on IoT more than anything else. The role of IoT in smart cities should not be underestimated, and the sooner city planners understand this, the sooner they will be able to administer smart cities across the world. Simply put, having an IoT platform in the city is the first step to creating or supporting a smart city.

    Smart Transportation

    With the addition of Smart Airports, we are surely moving towards smartness and intelligence in transportation. Airports are locations that have many moving parts, and with IoT, humans can increase predictive maintenance and ensure better management to improve safety and ensure comfort. A smart airport solution basically covers aspects such as visualized safety, visualized operational processes, airport IoT, and visualized services.

    During the last decade, the global civil aviation industry has boomed. The industry has seen continuous growth in both the number of passengers traveling by air and the revenue it generates. Recent statistics from the International Air Transport Association (IATA) suggest that demand for air travel will grow by over seven percent in 2018, and the total number of passengers is expected to rise sharply as well. Given this growth, it can safely be said that civil aviation is one of the most rapidly growing industries in the world.

    The concept behind smart airports aims to limit the downtime that aircraft go through, and the efficiency achieved through the technology is expected to save passengers more than 2,300 hours every day. Moreover, airports can give passengers a more convenient and time-effective service by sparing them the hassle they normally go through before getting on the plane. A smart airport system leverages technologies such as Big Data, IoT, and biometrics to help passengers get through all security checks by just presenting their faces. Moreover, visual surveillance would lead to better tracking without any significant blind spots. The system may use the input from the visual surveillance to analyze sensor-triggered alarms. Alarms from sensors are often misinterpreted, which is why the system rechecks them against visual data before sending them forward. This reduces the false alarm rate by over 90 percent on average.
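
    The alarm recheck described above can be illustrated with a minimal sketch. This is not Huawei's implementation; the names (SensorAlarm, VisualDetection, confirm_alarms), the 10-second matching window, and the 0.6 confidence threshold are all hypothetical, chosen only to show the idea of forwarding a sensor alarm when visual data from the same zone backs it up.

```python
from dataclasses import dataclass


@dataclass
class SensorAlarm:
    sensor_id: str
    zone: str
    timestamp: float  # seconds since some common reference


@dataclass
class VisualDetection:
    zone: str
    timestamp: float
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical video analytics step


def confirm_alarms(alarms, detections, window_s=10.0, min_confidence=0.6):
    """Keep only alarms backed by a visual detection in the same zone.

    An alarm is confirmed when at least one detection shares its zone,
    falls within window_s seconds of it, and meets min_confidence.
    Both thresholds are assumptions made for illustration.
    """
    confirmed = []
    for alarm in alarms:
        for det in detections:
            if (det.zone == alarm.zone
                    and abs(det.timestamp - alarm.timestamp) <= window_s
                    and det.confidence >= min_confidence):
                confirmed.append(alarm)
                break  # one supporting detection is enough
    return confirmed


# Usage: the second alarm has no nearby visual evidence, so it is held back.
alarms = [
    SensorAlarm("s1", "gate_a", 100.0),
    SensorAlarm("s2", "gate_b", 105.0),
]
detections = [
    VisualDetection("gate_a", 103.0, 0.9),
    VisualDetection("gate_b", 300.0, 0.95),
]
forwarded = confirm_alarms(alarms, detections)
print([a.sensor_id for a in forwarded])
```

    In this sketch unconfirmed alarms are simply dropped; a real system would more likely queue them for human review rather than discard them.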

    Smart Retail

    The retail sector has been expecting a digital revolution for the past couple of years. However, because of the numerous challenges involved, retailers shied away from the innovation. These challenges relate to downtime during deployment and the implementation of electronic price tags.

    Smart retail brings smart inventory management to the retail business and aims to give retailers the flexibility they need. A smart retail initiative will be able to integrate services, switches, and storage devices to support mainstream visualization software.

    Moreover, theft and inventory loss are among the biggest concerns for most store owners, and a smart retail cloud solution will keep track of all such instances and alert the relevant agency when such an anomaly occurs. With cloud support behind them, smart retail providers will be able to reach out to customers individually and give them a personalized experience based on their history and other factors.

    Smart Supply Chain

    The smart supply chain endeavors to ensure optimal delivery patterns in today's supply chain world. Supply chain is a growing industry that is embracing digital transformation, and we will soon start seeing how much it is changing for the better. The industry disruption has happened, and the use of smart Internet resources has reduced many of the errors that usually occur in the supply chain. A smart supply chain system handles anomalies in the process by predicting them from different insights and taking action to limit the resulting problems.

    The supply chain process should be clear of all hindrances, and through proper tracking of all goods, organizations can now keep a smart eye on their delivery networks. A smart supply chain network would effortlessly track all items and ensure that all goods pass through the network without any hindrances.

    Smart Manufacturing

    The smart factory, or the concept of Industry 4.0, connects human beings with an intelligent system of robots to create unparalleled augmentation. A fully connected factory would be able to use smart methods of production to increase its production capabilities and reduce the waste it usually generates.

    Smart manufacturing would generate real-time analysis of product quality from production to the market. The system will be able to identify parameters that have an impact on quality, and to implement methods of automatic data collection. Manufacturers can use online cloud diagnosis to gather real-time and expert resources. Moreover, the machine learning engine will use the system placed inside it to quickly locate faults and present solutions for them. This efficient fault detection is helpful, as it saves a lot of downtime and lets manufacturers focus on production rather than wasting time on errors.

    After having a look at the five use cases I’ve mentioned above, and the work Huawei is doing in this regard, one can tell that the intelligent world really is upon us. And, propelling this intelligent world of smart networks forward is none other than Huawei.


    The post The Intelligent World: 5 Cases from Today appeared first on Ronald van Loons.