 

Planet big data

Planet big data is an aggregator of blogs about big data, Hadoop, and related topics. We include posts by bloggers worldwide. Email us to have your blog included.

 

May 20, 2012


Derrick Harris

‘Can I help you?’: How LivePerson decides who’s worth the personal touch

Even if you haven't heard of LivePerson, chances are you've encountered one of its products while browsing online. It's the company behind those pop-up windows offering real-time chat with a...

...
 

May 18, 2012


Revolution Analytics

Because it's Friday: Game theory

Game Theory is the mathematical study of how agents in a system make choices for their actions, in light of the fact that other agents are also making competitive choices of their actions. As the...

...

Derrick Harris

The unsexy side of big data: 5 tools to manage your Hadoop cluster

It's neither easy nor glamorous -- data scientists get all the love -- but making sure your Hadoop cluster is properly configured and applications are running optimally is necessary, especially as...

...

Revolution Analytics

R is to SAS as Java is to COBOL

An interview with Revolution Analytics CEO Dave Rich was published this week by BeyeNetwork. During the interview, Dave was asked about how the statistical modeling platforms have changed over the...

...

Revolution Analytics

In Mexico, more marriages ending in divorce, and sooner

R user Diego Valle analyzed the rate of divorce among Mexican marriages since 1993 (the earliest date for which data are available) and found that not only have more marriages ended in divorce over...

...
 

May 17, 2012


Ricky Ho

Predictive Analytics: Data Preparation

As a continuation of my last post on predictive analytics, in this post I will focus on describing how to prepare data for training the predictive model. I will cover how to perform necessary...

...

Forrester Blogs

What's Your Big Data Score?

If you think the term "Big Data" is wishy washy waste, then you are not alone. Many struggle to find a definition of Big Data that is anything more than awe inspiring hugeness. But, Big...

...

Revolution Analytics

Orbitz: R has become the data-mining tool of choice

Sameer Chopra, vice president of Advanced Analytics at Orbitz Worldwide, wrote recently in Analytics magazine about the changing landscape of processes, software and systems for statistical modelers....

...

Revolution Analytics

Where's Waldo? Image Analysis in R

R user Arthur Charpentier attempts to use the raster library and R functions to find Waldo in a "Where's Waldo" image: Sadly, it turned out that Waldo was a bit too tricky to spot using these...

...

Derrick Harris

Scoop: Google, Microsoft both targeting Amazon with new clouds

Google is hard at work on a cloud computing offering that will compete directly with the popular Amazon EC2 cloud, I have been told, although Microsoft probably will beat it to the punch. The timing...

...

Curt Monash

Thoughts on data science

Teradata is paying me to join a panel on “data science” in downtown Boston, Tuesday May 22, at 3:00 pm. A planning phone call led me to jot down a few notes on the subject, which...

...
 

May 16, 2012


Ricky Ho

Predictive Analytics: Overview and Data visualization

I plan to start a series of blog posts on predictive analytics, as there is an increasing demand for applying machine learning techniques to analyze large amounts of raw data. This set of technique...

...

Derrick Harris

Calvin: A fast, cheap database that isn’t a database at all

Yale researchers Daniel Abadi and Alexander Thomson think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn't even technically a database. The...

...

Revolution Analytics

Revolution Newsletter: May 2012

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full May edition (with highlights from this blog and community events) online. You can...

...

Omer Brandis

SAP Classification vs Enhancing Master Data Tables - part III

I've recently learned of a better way to select classification data without the need for many joins or subselect statements....

...

Derrick Harris

Does function trump form in application design?

There's a principle of application design that beautiful means usable, but a new study out of Google suggests that while beauty doesn't necessarily affect perceived usability, poor usability can...

...
 

May 15, 2012


Revolution Analytics

How long before R overtakes SAS and SPSS?

Based on an analysis of Google Scholar data on usage of statistical software, Bob Muenchen makes a forecast: R will overtake SAS and SPSS in 2015. Forecasting is extrapolation — always a tricky...

...

Derrick Harris

Your data has a secret, but you — yes, you — can make it talk

Who needs a Ph.D. in statistics when you have the cloud? Machine learning is high data science, and it's fast becoming something that anyone can leverage to sell more handbags, or solve a research...

...
 

May 14, 2012


Derrick Harris

Thwarting terrorism with creativity and lots of data

There's nothing quite like a hypothetical about someone setting a whole block on fire after cutting off the fire department's electric supply in order to slow its response. Is it comforting to know...

...

Revolution Analytics

Multiple Sclerosis Tweet-Chat: Review

We had a great Twitter conversation last Thursday on the use of big-data analytics, Revolution R Enterprise, and IBM Netezza in the search for a cure for MS. Many thanks to the other panelists:...

...

Revolution Analytics

New courses from R gurus

Looking to learn R, or to expand your R skills for data visualization or package development? Here are some R courses presented by the experts you may be interested in: June 19-20: Visualization in R...

...

Derrick Harris

Yahoo’s big data play Genome is smart, but …

Yahoo is looking to leverage its big data prowess with a new tool for marketers called Genome. It looks like an acknowledgement that while Yahoo might not rule the web anymore, it knows a heck of...

...

Too Much Information

Forthcoming Webinar: Real World Success from Big Data

The initial focus of ‘big data’ has been about its increasing volume, velocity and variety — the “three Vs” — with little mention of real world application. Now is the time to get down to business.

On Wednesday, May 30, at 9am PT I’ll be taking part in a webinar with Splunk to discuss real world successes with ‘big data’.

451 Research believes that in order to deliver value from ‘big data’, businesses need to look beyond the nature of the data and re-assess the technologies, processes and policies they use to engage with that data.

I will outline 451 Research’s ‘total data’ concept for delivering business value from ‘big data’, providing examples of how companies are seeking agile new data management technologies, business strategies and analytical approaches to turn the “three Vs” of data into actionable operational intelligence.

I’ll be joined by Sanjay Mehta, Vice President of Product Marketing at Splunk, which was founded specifically to focus on the opportunity of effectively getting value from massive and ever changing amounts of machine-generated data, one of the fastest growing and most complex segments of ‘big data’.

Sanjay will share big data achievements from three Splunk customers, Groupon, Intuit and CenturyLink. Using Splunk, these companies are turning massive volumes of unstructured and semi-structured machine data into powerful insights.

Register here.


Curt Monash

Notes on the analysis of large graphs

This post is part of a series on managing and analyzing graph data. Posts to date include: Graph data model basics Relationship analytics definition Relationship analytics applications Analysis of...

...

Curt Monash

We’re back

Our blogs have been moved to a new hosting company, and everything should be working. Ditto our business site. If you notice any counterexamples, please be so kind as to ping me.

...
 

May 13, 2012


Derrick Harris

Did Yahoo sow the seeds of its own demise with Hadoop?

As the world once again starts analyzing Yahoo's myriad woes after Sunday morning's ouster of embattled CEO Scott Thompson, I'm left wondering if its investment in Hadoop didn't aid in the company's...

...

Derrick Harris

Facebook’s delicate balance between profits and privacy

If Facebook really is overvalued leading up to its IPO, privacy might be the underlying cause of the company's missed expectations. As it turns out, pleasing both investors and users isn't easy for a...

...
 

May 12, 2012


Jeff Jonas

Self-Correcting False Positives/Negatives: Exonerate the Innocent

This blog entry is dedicated to false positives and false negatives, specifically why it is so essential that systems find and fix them. A false negative occurs when an assertion about something...

...
 

May 11, 2012


David Corrigan

Questions from the Market – “How can I reduce the cost of data?”

I’m starting an ongoing series that will be based on questions I’ve been asked when speaking at the Big Data and Information Governance forums.  The first question I’ll cover is “How can you control...

...

Revolution Analytics

Because it's Friday: Australian PSAs from the 80s

When I was a kid growing up in Australia, it seemed like every commercial break during the Saturday morning cartoons or after-school shows was punctuated by some PSA encouraging us to lead a...

...

Big data Big analytics

Getting Into a Privacy Identity Innovation (pii2012) Frame of Mind: Will We See You There?

By Mary Ludloff As you all know, privacy is one of my favorite topics. And when you’re talking or blogging about privacy, it almost always comes back to personally identifiable information (pii)...

...

Revolution Analytics

Mariano Rivera’s baseball prowess, illustrated with R

Kevin Quealy, graphics editor at the New York Times, has published another fascinating behind-the-scenes look at how the Times creates data visualizations for print and online. In his latest post, he...

...

Revolution Analytics

On the language of Mad Men

Turns out that Megan would have never gotten a callback for an audition. (Via Ben Schmidt.)

...
 

May 10, 2012


Revolution Analytics

In case you missed it: April 2012 Roundup

In case you missed them, here are some articles from April of particular interest to R users. Information Age published a feature article on R, describing how new graduates are driving adoption of R...

...

Leon Katsnelson

Happy Mother’s Day to Big Data and Cloud Mamas

With Mother’s Day upon us and with many of my blog followers having mothers (my big data analytics software estimates this at close to 100%) or being mothers themselves, I thought I’d post...

...

Revolution Analytics

EU court's SAS ruling conflicts with Oracle v Google

In a blow to SAS's efforts to litigate competitor and low-cost SAS clone WPS out of existence, the European Union High Court has ruled that programming languages can't be copyrighted. SAS Institute...

...

Derrick Harris

Preventing counterfeits with an iPhone and digital DNA

Applied DNA Sciences thinks it has created the perfect tool for identifying attempts to counterfeit or steal goods along the supply chain. It's mobile meets cloud computing meets big data, and it...

...

IM C3oC Team

Celebrate BigInsights 1.3.0.1 with a free course on Big Data University

IBM BigInsights is a distribution of Hadoop, a framework for massively parallel processing of unstructured data. It has just been updated to 1.3.0.1. You can easily deploy Hadoop on Amazon EC2 and...

...

Derrick Harris

EMC goes all-flash, buys XtremIO for $430M

EMC has bought Israeli flash-storage startup XtremIO for $430 million, according to Israeli news site Globes. The acquisition was expected after rumors began swirling in late April that EMC was...

...
 

May 09, 2012


Revolution Analytics

See R integrated with QlikView, Jaspersoft, Excel, and mobile apps

In yesterday's webinar, Revolution Analytics CTO David Champagne demonstrated how to integrate statistical graphics and analytic computations created using R software with a variety of third-party...

...

Derrick Harris

How Google is growing up into a real IT company

Of the dozens of meeting requests I received for this year's Interop conference, the one I least expected came from Google. Interop is all about enterprise IT -- networks, security, servers, stuff...

...

Derrick Harris

VMware: ‘The software-defined data center is coming’

VMware CTO Steve Herrod has a message for the IT world: "[S]pecialized software will replace specialized hardware throughout the data center." Via virtualizations and SDNs, software-defined data...

...

Curt Monash

Comments are briefly being turned off

I need to move web hosts, and am initiating the process now. This involves a large file copy, a recopy of same, and a variety of manual steps. So until the process is complete, updating site...

...

Derrick Harris

Is big data just a fad, or something much more profound?

It remains to be seen whether “big data” is the flavor of the month or something more profound. What's indisputable is that more data is available than ever before and we now have the tools to...

...
 

May 08, 2012


Revolution Analytics

A sociologist converts from Stata to R

Ph.D. candidate in sociology Ethan Fosse just switched from Stata to doing 100% of his analysis with R. His reasons? If you want to do Bayesian analysis or graph modeled coefficients (or work with...

...

Too Much Information

The Data Day, Today: May 8 2012

IBM acquires Vivisimo. Funding for Birst, ParAccel, Metamarkets and DataSift. And more.

An occasional series of data-related news, views and links posts on Too Much Information. You can also follow the series @thedataday.

* For 451 Research clients

# IBM picks up Vivisimo to search for value in ‘big data’ Deal Analysis

# Teradata delivers on analytic cloud vision with Active Data Warehouse Private Cloud Impact Report

# The Big Blue picture for ‘big data’ analytics: IBM sheds light on BigSheets Impact Report

# Oversight Systems’ Continuous Analysis extracts actionable insight from data Impact Report

# Kalido updates MDM offering with business users, operationalizing master data in mind Impact Report

# Delphix reaps reward from agile approach to database virtualization Impact Report

# Automated Insights looks to pitch narrative, visuals and stats to enterprises Impact Report

# myDIALS eyes indirect sales in quest to be Internet access layer for analytics Impact Report

* IBM Advances Big Data Analytics with Acquisition of Vivisimo Also announces support for Cloudera.

* Teradata Announces 2012 First Quarter Results Revenue up 21% (PDF)

* Actuate Reports First Quarter 2012 Financial Results Revenue up 9% (PDF)

* Birst Secures $26 Million in Financing Led By Sequoia Capital

* ParAccel Closes Record Q1 Revenues and $20 Million Investment Round

* Metamarkets Raises $15 Million to Deliver Data Science-as-a-Service

* DataSift adds $7.2M: The story so far and focus for the future

* Teradata to Acquire eCircle (PDF)

* Google BigQuery brings Big Data analytics to all businesses

* TIBCO Spotfire Brings the Power of Data Discovery to Big Data and Extreme Information

* Jaspersoft Teams with VMware To Deliver Business Intelligence for Data-Driven Cloud Applications

* Kalido and Teradata Sign Global Reseller Agreement

* Actuate Announces Cloudera Alliance to Support Apache Hadoop and BIRT Developers in Big Data Integration

* Hortonworks and Kognitio Announce Technical Partnership Driving Apache Hadoop Adoption in Big Data Analytics Implementations

* Tokutek and PalominoDB Partner to Bring Scale, Performance to Database Deployments

* Acunu is pleased to announce v2 of the Acunu Data Platform!

* Is Yahoo really threatening memcached and Open Compute?

* Introducing Zend DBi as a MySQL Replacement on IBM i

* Zettaset and Hyve Solutions Build First Fully Integrated Enterprise OS Hadoop Solution

* Cloudera Announces New Japanese Subsidiary

* Bull Announces the Formation of Database Migration Business Unit

* Couchbase to Run Native with Key-Value API for ioMemory

* The Big Data Value Continuum

* Big Data is Business Intelligence plus Attention Deficit Disorder

* Nokia released Dempsy, an open source stream data processing platform.

And that’s the Data Day, today.


Revolution Analytics

R and Foursquare's recommendation engine

Foursquare, the mobile location-sharing app (of which I'm a big fan), has an excellent recommendation system. Based on your recent checkins, places your friends found popular, and even the time of...

...

Derrick Harris

Survey: Cloud’s hard, insecure and my boss made me do it

Many IT professionals find migrating corporate applications to the cloud a difficult, time-consuming and altogether painful task, according to results of a new survey from Cisco. While the majority...

...

Omer Brandis

Using array datatypes

It seems that PostgreSQL and Greenplum DB have an array datatype. So far so good, but so what? Do you wish to know what they are good for?

...
 

May 07, 2012


Derrick Harris

The big picture on Rackspace’s Q1: It’s becoming Mr. Hyde

Rackspace is the Dr. Jekyll of hosting. For the last few years, it has been a legacy managed hosting provider by day that dabbled in cloud computing at night. As Dr. Jekyll ultimately did, though,...

...

Forrester Blogs

ARM Arrives – Calxeda Shows Real Hardware Running Linux

I said last year that this would happen sometime in the first half of this year, but for some reason my colleagues and clients have kept asking me exactly when we would see a real ARM server running...

...

Revolution Analytics

Thursday: Tweet-chat on Multiple Sclerosis research

The story about the great work that SUNY Buffalo has been doing to find a cure for Multiple Sclerosis with Revolution R Enterprise and IBM Netezza has generated a lot of attention, with stories in...

...

Curt Monash

Site reliability has been ghastly

Unfortunately, we’ve had serious site outages over the past few days, as well as an increased frequency of shorter-term problems. My ordinarily excellent hosting company is going through a bad...

...