Mashable’s Social Good Summit took place in New York last week. You can read summaries of the talks and discussions on the Mashable website. Reading through the reports, I was particularly taken by a panel discussion on the balance between Internet freedom and big data. Security and personal freedom are big concerns in the world of mobiles, and security issues are even surfacing in mHealth.

Image: Internet Connections Worldwide.

With the increasing digitalisation of the world, the amount of data available from all aspects of people’s lives is growing. From health records to favourite music, the range of data now stored is mind-boggling. As mobile phone usage goes up around the world, both the ways of collecting data and the kinds of data that can be collected keep multiplying.

What is “big data”?

The term “big data” has been around for 12 years. Any definition must take into account how the use of the term, and what people mean by it, has changed over that time. The most basic definition is that “big data” is the term for a collection of very large and complicated data sets.

Timo Elliot has an excellent description of the big data phenomenon in which he breaks the term “big data” into seven definitions. My favourite of these is that big data is a metaphor, described by Rick Smolan in his book The Human Face of Big Data as “the process of helping the planet grow a nervous system, one in which we are just another, human, type of sensor”.

In the panel on big data and security the participants seemed to be using the term “big data” to mean the digitisation of information that is collected on people in their day-to-day lives. This kind of big data started when it became cheaper to keep data rather than throw it away.

What’s the big deal with big data?

This is a complicated issue. There are ways in which big data can be used for social good and ways in which it can cause harm. A key question is whether the benefits outweigh the costs.

During a press conference following the social good panel, Robert Kirkpatrick, the director of the U.N.’s Global Pulse, said:

At the U.N. we see big data as the greatest opportunity for relief and development that the world has ever seen. Unless you fail to protect privacy and process, in which case it is the greatest threat to human rights.

Potential for harm

There is a popular meme going around the internet that says:

If the service is free, you’re not the customer – you’re the product.

This is a witty way of summing up a big problem with big data. When you use a free online service such as Twitter or Facebook, the price you are really paying is your personal information. It must also be pointed out, though, that paying for a service does not mean the company isn’t using your personal information as well.

Target store. Image from WikiCommons.

The problem is that when we hand over our information, we have no idea how it will be used in the future. Once it has been given, it may be stored indefinitely as part of big data.

As more information gets stored we increasingly lose our personal freedoms. The recent NSA scandal is a good example of how far things can get out of hand with information mining.

By analysing big data, companies can even know things about us that we do not know ourselves. For example, the American store Target sent coupons for nappies and similar baby products to a teenage girl based on her shopping record. The company knew she was pregnant before her own family did.

There is great potential for us to lose our privacy.

Potential for good

Now that it is cheaper to keep digital data than to throw it away, there are huge quantities of information in the world. How can storing this information be bad? It is a huge resource, and we now have the potential to understand people in a way and at a depth that we could not before.

There are currently many innovative, exciting and potentially world changing ways in which big data is being analysed and used.

A group in Boston is using the increasing use of social media to try and prevent suicide. They have created a program that searches for keywords in social media that may indicate suicidal thoughts.

This project is still in the early stages of development but is a good example of how big data can be used for good.
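To give a rough idea of what "searching for keywords" can mean in practice, here is a minimal Python sketch. It is not the Boston group's program, whose details are not given here: the phrase list, the sample posts and the function name `flag_posts` are all made up for illustration, and a real project would need clinically reviewed terms and far more careful handling.

```python
import re

# Hypothetical watch-list; a real project would use clinically reviewed terms.
KEYWORDS = ["hopeless", "can't go on", "no way out"]

# One case-insensitive pattern that matches any phrase as a whole word or phrase.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def flag_posts(posts):
    """Return the posts that contain at least one watch-list phrase."""
    return [post for post in posts if PATTERN.search(post)]

# Made-up sample posts, not real social media data.
sample_posts = [
    "Great match last night!",
    "I feel hopeless and there is no way out.",
    "Traffic was hopelessly slow today.",
]
flagged = flag_posts(sample_posts)
```

Even this toy version shows why the problem is hard: matching whole words keeps "hopelessly slow" from being flagged, but simple keyword matching still cannot tell a joke or a song lyric from a genuine cry for help.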

Big data is also being used in the battle against cancer. Genetic data has been correlated with responses to various cancer treatments to predict which treatment is likely to work. This avoids the brutal and ineffective “try-one-and-see” approach. It is an example of connecting two large datasets to gain insights that are not visible in either dataset on its own.
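The idea of connecting two datasets can be sketched in a few lines of Python. Everything below is invented for illustration: the patient IDs, variant names, drug names and the function `response_rates` are hypothetical, and real studies use far larger datasets and proper statistics.

```python
from collections import defaultdict

# Hypothetical dataset 1: which gene variant each patient carries.
genetic = {"p1": "variant_A", "p2": "variant_B", "p3": "variant_A", "p4": "variant_B"}

# Hypothetical dataset 2: (treatment given, whether the patient responded).
outcomes = {
    "p1": ("drug_X", True),
    "p2": ("drug_X", False),
    "p3": ("drug_X", True),
    "p4": ("drug_Y", True),
}

def response_rates(genetic, outcomes):
    """Join the two datasets on patient ID, then compute the
    response rate for each (variant, treatment) pair."""
    counts = defaultdict(lambda: [0, 0])  # [responders, total]
    for patient_id, variant in genetic.items():
        if patient_id not in outcomes:
            continue  # skip patients missing from the second dataset
        treatment, responded = outcomes[patient_id]
        counts[(variant, treatment)][1] += 1
        if responded:
            counts[(variant, treatment)][0] += 1
    return {key: responders / total for key, (responders, total) in counts.items()}

rates = response_rates(genetic, outcomes)
```

Neither dataset alone says anything about which drug suits which variant. Only after joining them on the shared patient ID does a pattern, such as drug_X working for variant_A in this made-up data, become visible.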

Big data is a great resource of information.

How to protect privacy

Global Pulse claims to have dealt with issues of privacy by having “privacy and data protection principles” that guide organisations in how to gather and use information. The key, they say, is that Global Pulse doesn’t collect any big data that contains any identifiable information. So, no names or addresses.

As increasing amounts of information are gathered, though, it seems a difficult task to make it entirely non-identifiable. Depending on your project, gathering no personal information will also not always be helpful. For mobile campaigns a phone number is essential, and having a name helps to make the messages you send more personal.
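One possible middle ground is to keep personal details only where they are needed for sending messages, and to strip them from the copy of the data used for analysis. The Python sketch below shows the idea under stated assumptions: the field names, the function `pseudonymise` and the secret key are hypothetical, and this is not a description of how Global Pulse or any other organisation works. Note that a keyed hash of a phone number is pseudonymous rather than truly anonymous, so the result still needs protecting.

```python
import hashlib
import hmac

# Hypothetical field names; adjust to the actual survey or SMS export.
DIRECT_IDENTIFIERS = {"name", "address"}

def pseudonymise(record, secret_key):
    """Drop direct identifiers and replace the phone number with a keyed hash."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "phone"
    }
    if "phone" in record:
        digest = hmac.new(secret_key, record["phone"].encode("utf-8"), hashlib.sha256)
        # The same phone number always maps to the same ID, so repeat
        # responses can be linked without storing the number itself.
        cleaned["respondent_id"] = digest.hexdigest()[:16]
    return cleaned

# Made-up example record.
record = {
    "name": "Example Person",
    "phone": "+10000000000",
    "address": "1 Example St",
    "answer": "yes",
}
safe = pseudonymise(record, secret_key=b"replace-with-a-real-secret")
```

The analysis copy keeps the survey answer and a stable respondent ID, while the name, address and raw phone number stay in a separate, more tightly controlled store.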

Conclusion

Humanitarian projects are more effective when organisations can get detailed and timely feedback on the progress of their efforts. Big data – and above all smart data – allows organisations to make better use of their scarce resources. The services that engageSPARK will offer will allow large and small organisations to easily and quickly gather the evidence that will help them learn and become even more effective.

The use of big data has great potential for social good, but it is also open to great misuse. Hopefully projects will be able to walk the fine line between using this information to help the world and protecting personal freedom. The most important thing is open discussion of the potential and pitfalls of big data.