Bridging the Insight Gap: Leveraging Machine Learning in Your Organisation

Posted By Phillip Higgins

Today, many institutions already meet their core machine learning use cases with advanced statistical packages, in areas such as risk and fraud detection, recommendation and cross-sell, segmentation, optimisation and computational advertising. Many other organisations, though, have little capability in this area and feel a need to develop predictive analytics and machine learning for competitive advantage. They face a double-edged sword: on one side the increasing volume and velocity of data being generated, and on the other what is termed the insight gap, the difference between the return your organisation actually gets from its information assets and the maximum return possible. Together these make existing toolsets and skillsets irrelevant as requirements for information move from traditional reporting and what-if scenario analysis to predictive, machine-learning based paradigms.

Fortunately, a number of new tools have emerged that can ease the adoption of predictive methods, including machine-learning based methods, and you should know about them. Two open source projects stand out in this regard: R, the language and environment for statistical computing, and Mahout, the machine-learning library for Hadoop. Both have industrial-strength applications and are widely distributed.

R was originally developed at the University of Auckland and offers simple download and installation for PC, Mac and Linux. It provides processing, plotting and visualisation capabilities through a command-line driven interface with graphical output. R offers a powerful, functional language for statistical computation with a huge breadth of application. Because R is so widely distributed and well supported by both the open source and academic communities, it is a safe bet if you are after a desktop tool that can connect to a wide variety of data sources.

A number of tools make R easier to use: RStudio, R Commander and Rattle are some well-known ones. Microsoft has also enabled R from its Visual Studio IDE.

Mahout is a Java library from the Apache Software Foundation, also available for download, that is designed to run against Hadoop clusters and so uses the MapReduce paradigm for its implementation. Mahout offers machine learning scalable to “reasonably large datasets” in major areas such as classification, recommenders and clustering. Mahout is currently at version 0.8 but has a number of well-known applications and is well positioned as the machine learning library for distributed computing and Big Data.
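To give a feel for what running a clustering algorithm in the MapReduce style involves, here is a minimal sketch of one k-means iteration written as a map phase followed by a reduce phase. It is plain, single-machine Python and does not use Mahout or Hadoop at all; the function names (`nearest`, `map_phase`, `reduce_phase`, `kmeans_step`) and the sample data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit a (cluster_index, point) pair for every input point."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each cluster to get new centroids."""
    groups = defaultdict(list)
    for index, point in pairs:
        groups[index].append(point)
    # A cluster that received no points keeps its previous centroid.
    new_centroids = list(centroids)
    for index, members in groups.items():
        new_centroids[index] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return new_centroids


def kmeans_step(points, centroids):
    """One k-means iteration expressed as map followed by reduce."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Illustrative data only: two obvious groups and two rough starting centroids.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = [(1.0, 1.0), (9.0, 9.0)]
print(kmeans_step(points, centroids))
```

On a real cluster the map phase would run in parallel across the machines holding the data and only the per-cluster groups would be brought together for the reduce phase, which is what lets this style of algorithm scale with the size of the dataset.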

(First published July 24, 2013)
