Team Data Science

The Fix Is In

photo: JD Hancock

Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

So say the opening sentences of the “Big data” article in Wikipedia. The people primarily responsible for conquering those challenges are data scientists. Being a data scientist these days is rather like being a Renaissance person; one must possess knowledge of and competency in a wide variety of subjects related directly and indirectly to the fields of mathematics and science.

Fortunately, data science has a number of sub-specialties to share the load. Understanding–defining–who does what (capturing, curating, storing, searching, sharing, transferring, analyzing, visualizing, processing) and why they do it means companies building data science teams can intelligently choose the areas of specialization that will best serve their goals.

Five Roles You Need on Your Big Data Team

Of course, there’s the data scientist, the coveted knight in shining armor who visualizes models and creates (and continuously optimizes) sophisticated algorithms to transform data into something useful. But she could not do her part to fulfill corporate expectations without the support of these equally coveted specialists:

  • Data hygienists, who deal with the “dirty data” problems inherent in collecting data so the data is clean now and stays clean in the future.
  • Data explorers, who burrow into all the data a company collects to determine what, if anything, can be done with it, including how data originally collected for a different reason might be repurposed.
  • Business solution architects, who structure and organize data so it’s properly updated and where it needs to be within the necessary timeframe of every query–a critical feature of today’s data science when queries are ‘answered’ in real-time.
  • Campaign experts, who, with an in-depth understanding of both the technology and marketing, can turn the knowledge derived from data into insight and then into advice.

Assembling a powerful data science team, whether that team is internal or third-party, is necessary for applying big data tools. However, success rests as heavily on the right corporate culture as it does on the right specialized people. The best solution? Welcome reevaluation, innovation, and experimentation, and keep the focus on the end game.

Scarce and growing scarcer

Being able to use the knowledge derived from data, and achieving the insights to which data can lead, is the centerpiece of a marketer’s requirements in the big data era. But before you can acquire knowledge, you must understand the data itself: how its patterns fit together and suggest other patterns, and how to work with it to produce useful, meaningful knowledge. Enter data scientists.

Employment opportunities for data scientists are growing. They will continue to grow, and some institutions are putting educational programs in place to help meet future demand. However, projections suggest the demand for data scientists will soon exceed their availability. A compelling graphic synthesizes the problem.

Top skill set Requirements to be a Data Scientist

Data scientists aren’t data analysts. While the two roles may start with a grounding in scientific and mathematical skills, a data scientist is far more a “Renaissance individual who really wants to learn and bring change to an organization,” says Anjul Bhambhri of IBM. About a data scientist’s skill set, Mark van Rijmenam writes,

They need to have statistical, mathematical, predictive modelling as well as business strategy skills to build the algorithms necessary to ask the right questions and find the right answers. They also need to be able to communicate their findings, orally and visually. They need to understand how the products are developed and even more important, as big data touches the privacy of consumers, they need to have a set of ethical responsibilities.

Often, related fields of study pair with a breadth of programming, managing, processing, and curating skills to shape the qualities of individuals who will guide a business’s effective use of data. Van Rijmenam suggests an ideal data scientist would have the following skills.

  • Strong written and verbal communication skills;
  • Being able to work in a fast-paced multidisciplinary environment as in a competitive landscape new data keeps flowing in rapidly and the world is constantly changing;
  • Having the ability to query databases and perform statistical analysis;
  • Being able to develop or program databases;
  • Being able to advise senior management in clear language about the implications of their work for the organisation;
  • Having an, at least basic, understanding of how a business and strategy works;
  • Being able to create examples, prototypes, demonstrations to help management better understand the work;
  • Having a good understanding of design and architecture principles;

We would add that, while an effective data scientist requires latitude to consider and experiment (work autonomously), she must also be able to work cooperatively. Data scientists are members of teams that aren’t simply made up of senior leaders. Plenty of other employees work in the trenches and have ideas about situations that require solutions and about how those solutions would fit into the goals of other departments. Failures in cooperation and communication can lead to costly disasters.

Likely, few data scientists possess all the above qualities, so a business should prioritize the ones most important to it.

In planning to apply new technologies, businesses must also plan for how they will apportion responsibilities for critical data science needs–through third-party applications or data-science-specific internal departments or perhaps both. At present, we are gazing at the tip of the big-data, data-scientist iceberg. Demand for big data solutions is increasing. So is the demand for the innovators behind the solutions.

The Conversation: Customer Relations Creepiness

Forget the creepiness of Google knowing so much about you it can recommend a restaurant based on your eating-out patterns (woe to you and the little gem of a restaurant Google is not going to recommend). The real creepiness (maybe) is when customer relations staff can’t get the essentials of customer relations just right.

In the world of handy big data apps and more, we need to ask ourselves how a business introduces “situational awareness” to the ways it conducts interactions with customers.

When Digital Marketing Gets Too Creepy

photo by: scragz

My Kingdom for a Smoothly Run Conference

Right now, I am in Sweden, preparing to keynote a conference, thinking that the organizers are probably worrying about managing all the details. Every event organizer hopes their event will satisfy all their constituents but knows how challenging that can be. They want to make the conference run more comfortably for attendees and more efficiently for the hosting organization (while maximizing ROI), with speakers delivering the best content.

Planning and executing flawless events has never been an easy mission and lots of credit goes to those talented event organizers who pull it off regularly. However, what if it could be made easier? Today’s event organizers are flooded with more data than ever but with fewer resources to handle all these data sources.

On the other side of that coin, I speak at fifty to sixty conferences, give or take, every year. I know what poor conference experiences feel like. The experience a host serves up is critical, and how event planners use the data they collect can make or break the experience for attendee and organizer alike. Increasingly, big data strategies could help manage all the logistics associated with events, and they can do it in real time, as the conference is taking place.

I was invited as one of the many “VIP influencers” to speak at IBM’s SmarterCommerce Summit (Glen Gilmore explains what this means). During and following the conference, many of us shared our enthusiasm through blogs and tweets. IBM neither asked nor required anyone to blog or tweet about the event, but many did. It was difficult not to share enthusiasm for the ways IBM understands how commerce can get smarter (see Bryan Kramer’s reactions).

People are doing many mind-blowing things with big data technologies, but IBM addressed a matter near and dear to my conferencing heart, and I would like to share my enthusiasm: I have never attended an event run more smoothly.

At the conference, Alliance Tech demonstrated to me how they accomplished this feat. They use RFID technology, embedding sensors in badges to help organizers equipped with iPads manage a variety of tasks:

Track the real-time behavior of people at conference trade booths to evaluate a range of key metrics to encourage more at-show sales and develop an intelligent show strategy:

  • Ascertain the number and quality of leads individual exhibitors are generating
  • Monitor the flow of individuals through the conference spaces as well as keep track of audience numbers and compare them with session evaluations to determine the popularity and value of individual speakers.
  • Evaluate through social media the opinions and comments of people involved in the event to learn customer reactions and preferences.

Information on what is happening in real time helps organizers do things like negotiate the number of breakfasts needed each morning given the attrition rate of participants. Are more chairs needed in a late afternoon session? Does partitioning need to be rearranged to increase room size? Was a person who is criticizing a speaker actually sitting in the session? This information and more is at staff’s fingertips the second they need it.
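The questions above come down to simple lookups once badge reads are grouped by session. What follows is only a minimal sketch under stated assumptions: the post does not describe Alliance Tech's actual data format or software, so the `(badge_id, session_name)` tuples, the function name `attendance_by_session`, and the sample reads are all hypothetical.

```python
from collections import defaultdict


def attendance_by_session(badge_reads):
    """Group unique badge IDs by the session where they were read."""
    sessions = defaultdict(set)
    for badge_id, session in badge_reads:
        # A set ignores repeated reads of the same badge in one session.
        sessions[session].add(badge_id)
    return sessions


# Invented badge reads, for illustration only: (badge_id, session_name).
reads = [
    ("B001", "Keynote"),
    ("B002", "Keynote"),
    ("B001", "Keynote"),  # repeated read of the same badge counts once
    ("B003", "Analytics Workshop"),
]

sessions = attendance_by_session(reads)

# How many chairs did the keynote actually need?
keynote_headcount = len(sessions["Keynote"])

# Was the person criticizing the keynote speaker actually in the room?
critic_was_present = "B003" in sessions["Keynote"]
```

Storing each session's attendees as a set keeps both the headcount and the "was this person there?" check cheap, which matters if staff are querying the data while the event is under way.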

Social media tools can expand an event organizer’s understanding of what is happening then and there. To demonstrate how these applications fit in, IBM partnered with several companies to set up a social media command center. Whom did they monitor on the dashboard? Every participant.

Smarter Commerce 2013-05-21 (01)

OneQube (the left, white half of the dashboard), a company specializing in relationship management and engagement in Twitter chats and hashtags, displayed real-time information on all mentions and tweets for each participant. At one point, I ranked as a Jedi in the volume of real-time impressions; kind of cool since I hadn’t ever been a Jedi before. 🙂 That, alongside the Foreigner concert IBM put on, gave me real flashbacks to the late 1970s.

Smarter Commerce 2013-05-21 (03)

MutualMind (the right, black half of the dashboard), a social media monitoring and analytics company, set up an example of real-time analytics that can help businesses understand their customers in social context.

Smarter Commerce 2013-05-21 (02)

Eric Gore describes the specifics of MutualMind’s analytics.

In addition to its live feeds, MutualMind sent me a copy of a “My Personal Profiler” document.

Smarter Commerce 2013-05-21 (MutualMind 01)

A number of bloggers who were at the conference have more than a hundred thousand followers. I am not one of those bloggers. But when it comes to the Klout score, a measure of influence, I scored very high among the attendees, not because I have the biggest social reach, but because the people who follow me skew very influential.

MutualMind also prepared a word cloud of my audience’s topics; this draws from the frequency of words they put in their Twitter profiles to show who they are and what they are interested in. Many people’s profiles are filled with words related to “life” and “love” such as ‘coffee’ or ‘sailing.’ I was surprised to learn what my audience topic word cloud looked like.
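The counting step behind a word cloud like this can be sketched in a few lines. This is an illustration only, not MutualMind's method, which the post does not describe: the stop-word list, the function name `profile_word_counts`, and the sample bios are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at", "for", "to", "my", "on", "with"}


def profile_word_counts(profiles):
    """Count how often each non-stop word appears across profile texts."""
    counts = Counter()
    for text in profiles:
        words = re.findall(r"[a-z']+", text.lower())
        # Skip stop words and very short tokens that carry little meaning.
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts


# Invented sample bios, for illustration only.
sample_profiles = [
    "Big data analytics and marketing strategist",
    "Marketing technology, analytics, coffee",
    "Data scientist. Love sailing and analytics",
]

counts = profile_word_counts(sample_profiles)

# The most frequent words would be drawn largest in the cloud.
top_words = counts.most_common(3)
```

In a word cloud, each word's size is scaled by its count, so an audience whose profiles repeat professional terms produces a cloud dominated by those terms rather than by words like ‘coffee’ or ‘sailing.’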

Smarter Commerce 2013-05-21 (MutualMind 02)

Almost every word is indicative of my area of interest and occupation. This word cloud suggests my Twitter connections are in line with the audience IBM wanted to reach during this conference and shows what others find valuable in the content I shared.

Smarter Commerce 2013-05-21 (MutualMind 03)

From social media tools like these, event organizers can work with a level of business intelligence that hasn’t been possible before. Who is influencing whom? What are they saying that would help planners plan more effectively? What do participants want?

Logistics for a single event are complicated, but managing conference logistics as IBM is doing takes commerce to a whole new, smarter level. As someone who speaks at fifty to sixty conferences each year, I would be thrilled if all event planners were applying data in this way.

photo by: kevin dooley