What’s bigger than social media? Most would say nothing, considering the overwhelming statistics surrounding this buzzword of our generation. As of May 2, 2013, the social media giant Facebook boasted 1.11 billion users, who have uploaded some 250 billion photos to the site in total. Facebook is not alone in the intertwined social web. Websites like Twitter, Google Plus, LinkedIn, Instagram, and Pinterest are revolutionizing the way people interact. However, in Big Data: A Revolution That Will Transform How We Live, Work, and Think, Viktor Mayer-Schönberger and Kenneth Cukier make the case that big data will have an even bigger impact on how we make decisions. Big data refers to data sets that scale to multiple petabytes and are created or collected, stored, and shared collaboratively in real time. In Big Data, Mayer-Schönberger and Cukier convincingly argue that big data is a “resource and tool” that provides profound insights through correlation, yet carries inherent risk.
Mayer-Schönberger and Cukier stress that the
value of big data lies in correlation, not causation. This can be hard to digest, considering
that the sciences have always investigated the mystery of why, not just
what. They write, “Correlations let us
analyze a phenomenon not by shedding light on its inner workings but by
identifying a useful proxy for it.”
Knowing what, not why, is simply good enough. One example is Target’s big-data-driven
marketing strategy. As Mayer-Schönberger and Cukier
explain, Target and other retailers recognize that pregnancy is an important
time for establishing store and brand loyalty.
Expectant parents’ shopping habits become more open to change
over the course of a pregnancy. Target’s marketers saw an opportunity to take
advantage of this using their database. They identified the customers who had signed up
for the baby registry and analyzed which products those customers bought in the months leading up to
registration. These purchases would serve as
“markers” that a customer was likely pregnant. Target found around two dozen “markers,”
including unscented lotion and supplements like magnesium, calcium, and
zinc. With this information, Target
developed a “pregnancy prediction” score for customers based on their
purchase history at the store. Target
used this prediction score, with great precision, to send relevant coupons at
particular stages of pregnancy. Just
like Target, more businesses are recognizing the importance of predictive
analytics. Big Data provides many instances of corporations taking advantage
of “tell-tale” signs that predict a future event. This could be the “whirring” of a motor
before it breaks down, or the specific symptoms hospital patients complain of that
suggest a rapid readmission. What matters
most about these signs is that they point to what, not why. It’s unclear
why pregnant women buy unscented lotion around the third month of pregnancy,
but it’s useful information for Target’s marketers!
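To make the idea of a marker-based score concrete, here is a minimal sketch in Python. It is only an illustration of scoring by correlation: the book does not describe Target's actual model, so the marker weights, the threshold, and the function names `pregnancy_score` and `likely_pregnant` are all my own assumptions.

```python
# Hypothetical marker weights (in points). These values are invented for
# illustration and are NOT Target's real figures.
MARKER_WEIGHTS = {
    "unscented lotion": 3,
    "magnesium supplement": 2,
    "calcium supplement": 2,
    "zinc supplement": 2,
}


def pregnancy_score(purchases):
    """Add up the weights of the distinct marker products a customer bought."""
    bought = set(purchases)
    return sum(weight for item, weight in MARKER_WEIGHTS.items() if item in bought)


def likely_pregnant(purchases, threshold=5):
    """Flag a customer whose score reaches the (assumed) threshold."""
    return pregnancy_score(purchases) >= threshold


history = ["unscented lotion", "zinc supplement", "bread"]
print(pregnancy_score(history), likely_pregnant(history))
```

Notice that nothing in the sketch knows why these products matter. The score only records what tends to appear together, which is exactly the point the authors make.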
As Big Data explains, big data produces new dimensions of risk to the
public. The regulation of data has been
especially debated recently amid the controversy over NSA public surveillance. The NSA is not the only government agency
pressing forward to compile immense amounts of data on individuals. “In two recent developments, the Internal
Revenue Service has vastly expanded its data mining of private financial
transaction for tax enforcement.”1
This risk lies in the private sector as well. In 2006, Netflix released 100 million rental
records to the public in the hope that someone could improve its movie
recommendation system, offering one million dollars for an improvement of 10
percent. The records had been
stripped of each user’s personal information.
Researchers at the University of Texas at Austin compared the data
against other public information, such as the Internet Movie Database (IMDb). They found that ratings of just six obscure (outside the
top 500) movies could identify a Netflix customer 84 percent of the time. And if the date of each rating is
known, the accuracy improves to 99 percent.
Even though the privacy of our movie rentals may not be that important
to us, this example speaks to a much larger problem. We can no longer get lost among the data. “De-anonymization,” as Mayer-Schönberger
and Cukier explain, results from two factors: we capture more data, and we combine more
data. They argue that perfect
anonymization is simply impossible with big data. De-anonymization has massive implications
for the way data can be regulated, making regulation much more difficult. Big data is already transforming many aspects
of our lives and ways of thinking, forcing us to reconsider basic principles on
how to encourage its growth and mitigate its potential for harm. Mayer-Schönberger and Cukier explain the risks involved
with data, as well as how the government should approach regulating it. Even though big data use is complicated, and
at the forefront of technological development, I found their risk and regulation
analysis straightforward.
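The mechanics of de-anonymization are easy to sketch. The Python below is a deliberately simplified exact-match linkage, not the statistical method the Texas researchers used, and every record, ID, and movie title in it is invented for illustration. The function name `reidentify` is my own.

```python
def reidentify(anonymous_records, known_movies):
    """Return the anonymous ID whose record contains every known movie.

    Returns None when no record, or more than one record, matches,
    because then the person cannot be singled out.
    """
    known = set(known_movies)
    matches = [uid for uid, movies in anonymous_records.items()
               if known <= set(movies)]
    return matches[0] if len(matches) == 1 else None


# Hypothetical "anonymized" release: IDs replace names, titles are made up.
released = {
    "user_17": {"Blockbuster", "Obscure Film A", "Obscure Film B"},
    "user_42": {"Blockbuster", "Obscure Film C"},
    "user_88": {"Blockbuster", "Obscure Film A"},
}

# A popular title matches everyone, so it identifies no one.
print(reidentify(released, {"Blockbuster"}))
# Two obscure titles from a public profile single out one record.
print(reidentify(released, {"Obscure Film A", "Obscure Film B"}))
```

The sketch shows both factors at work: the release captures enough detail per person to be distinctive, and combining it with a second, public source is what attaches a name to the record.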
When big data is collected, for
whatever reason, it has value beyond its primary use. So one cannot simply dismiss the data-risk
argument by saying that every customer should sign a “notice and consent” form:
how can companies provide notice for a purpose that does not yet exist? Protecting privacy requires that big-data
users become more accountable for their actions. Mayer-Schönberger and Cukier also explain that new
institutions and professionals will need to emerge to interpret the complex
algorithms that underlie big-data findings, and to advocate for people who might
be harmed by them. As big-data predictions
grow more accurate, it becomes tempting to judge people not on what they have
done but on what they are predicted to do.
The roles of big data in
correlation and risk are only two of the topics covered in Big Data. Mayer-Schönberger and Cukier explain in more detail the
different dimensions of big data, such as its messiness, value, and future
development. Their style is consistent throughout the book. They explain their arguments succinctly and
support them with interesting anecdotes.
Although some themes were repetitive, I found that the stories sustained
my excitement and interest. Big Data introduces the reader to an
exciting frontier that touches every realm of society: everything deals with data, and now
on a much larger scale. Chris Lynch, the former CEO of big data
technology developer Vertica, who sold his company to Hewlett-Packard, claims, “Big
data is the foundation on which social media, mobility, cloud and online gaming
industries are being built.”2
1 Satran, R. (2013, July). NSA data fight could signal privacy war. U.S. News & World Report, 1. Retrieved from http://search.proquest.com.ezproxy.library.wisc.edu/docview/1428291593?accountid=465
2 Kovar, J. F. (2012). Big data is the next big thing. CRN, (1320), 12-n/a. Retrieved from http://search.proquest.com.ezproxy.library.wisc.edu/docview/1008956895?accountid=465
