Posted on

5 Myths of Big Data to Ignore

big data, data, data modeling, data science, data scientist



Big data is so popular and so widely discussed that people forget it’s still a young field. Data has been gathered and used at this scale for little more than a decade, and it’s useful to more than just big companies like Google or Amazon. Every company can benefit from big data in some way as it aggregates its data. Data exhaust and metadata are created along the way, and all three are integral to business analytics. For all the information that’s out there, big data still isn’t well understood, and more myths than facts get shared. Here are five myths about Big Data, each followed by a debunking.

5 Myths of Big Data  

Myth #1. Big Data is Only About Huge Amounts of Data – The three elements of Big Data are Variety, Velocity, and Volume. How much data is gathered is the least important of the three. Volume is the starting point, but it’s fluid and changes all the time. The other two are better indicators to work from. Variety covers the different data types, be it files, video, social media posts, etc. Velocity is the rate at which the data changes and how quickly it has to be used before it changes again.

Myth #2. You Have to Use Hadoop for Big Data – Hadoop is open-source software from Apache for working with Big Data. But Big Data is too big and too varied to be found, sorted and defined by a single program; it takes many. Hadoop is one of three classes of technology for working with Big Data. The other two are NoSQL and Massively Parallel Processing (MPP). On top of that, not every Hadoop component is the best fit for a given Big Data job, and some can be replaced with something that works better.

Myth #3. It Means Unstructured Data – A better term is multi-structured, as there are so many forms and types of data that can be gathered. Rather than being fixed up front, data models are built when the data is going to be used.

Myth #4. It’s for Social Media Feeds and Sentiment Analysis Only – It does do this, but that isn’t the only thing Big Data can analyze for you. Almost any type of data can be analyzed with Big Data techniques. Don’t cut yourself off from its full potential.

Myth #5. NoSQL Means No SQL – NoSQL is better read as “not only SQL.” These types of data stores offer different ways to find and sort the data. Technologies here include key-value stores, document-oriented databases, graph databases, and big table structures, to name a few. SQL access is still possible, and many tools can be used to do that work.
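To make Myth #3 a little more concrete: building the data model only when the data is going to be used is often called schema-on-read. Here’s a minimal sketch of the idea in Python. The event records, field names, and the two reader functions are all made up for illustration; they aren’t taken from any particular product.

```python
import json

# Hypothetical multi-structured events, stored exactly as they arrived.
# Each record has a different shape; nothing was modeled up front.
RAW_EVENTS = [
    '{"type": "post", "user": "ann", "text": "Love the new store!"}',
    '{"type": "purchase", "user": "bob", "amount": "19.99", "sku": "A-1"}',
    '{"type": "video", "user": "ann", "duration_s": 42}',
]


def read_purchases(raw_lines):
    """Apply a 'purchase' model at read time, ignoring other record types."""
    purchases = []
    for line in raw_lines:
        event = json.loads(line)
        if event.get("type") != "purchase":
            continue
        # The model (which fields, which types) is decided here, when used.
        purchases.append({"user": event["user"], "amount": float(event["amount"])})
    return purchases


def read_users(raw_lines):
    """A second, different model over the same raw data: just the users."""
    return sorted({json.loads(line)["user"] for line in raw_lines})


purchases = read_purchases(RAW_EVENTS)
users = read_users(RAW_EVENTS)
```

The same raw records feed two different models, and neither one had to exist when the data was stored.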


Posted on

Big Data Classification and Architecture

classification, big data, analyze, data analysis, data modeling

Source:  IBM developerWorks


Big data classification is not a new concept. It’s been around for a while; most people just didn’t realize it. What has changed is that companies now know what it is and use the results to find new clients. The question then becomes: how do you create an architecture that handles all the different information out there? First you classify the data, then you look for the architecture that fits. Each classification is really a big data problem that your company has. That’s why you classify the data first and then decide what kind of architecture you need to fix the problem.

Big Data Classifications 

Utilities: Predicting Power Consumption – Data creation completed by machines. Uses smart meters to measure consumption and power grids. Big Data solutions need the ability to analyze supply and demand.

Telecommunications: Customer Churn – The data comes from social media, the Web and transaction records. You need to create detailed churn models to keep up with the competition. Big Data solutions can help by using predictive analytics.

Marketing: Sentiment Analysis – The data is gathered from social media and the Web. Sentiment and profile data need to be integrated to produce any useful results.

Customer Service: Call Monitoring – The data needed here is human-generated. IT departments need the ability to analyze application logs in standardized formats to create Big Data files.

FSS/Healthcare: Detecting Fraud – The data here is machine-generated, transactional and human-generated. Fraud detection needs real-time or near-real-time monitoring to be effective. There’s no other way to react quickly when unusual activity is reported.

When you’re classifying the data, make sure to look at its characteristics, such as how it is generated and how quickly it has to be analyzed. This will help you figure out what kind of architecture you need to create a Big Data platform for your organization.
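If it helps to see the classifications side by side, here is a small sketch that records each problem above with the characteristics the post mentions. The `BigDataProblem` class and its field names are my own invention for illustration. Where the post doesn’t say whether real-time analysis is needed, the field is left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BigDataProblem:
    industry: str
    problem: str
    sources: Tuple[str, ...]          # how the data is generated
    real_time: Optional[bool] = None  # None = not stated in the post


PROBLEMS = [
    BigDataProblem("Utilities", "Predicting power consumption", ("machine",)),
    BigDataProblem("Telecommunications", "Customer churn",
                   ("social media", "web", "transaction")),
    BigDataProblem("Marketing", "Sentiment analysis", ("social media", "web")),
    BigDataProblem("Customer Service", "Call monitoring", ("human",)),
    BigDataProblem("FSS/Healthcare", "Detecting fraud",
                   ("machine", "transaction", "human"), real_time=True),
]


def by_source(problems, source):
    """List the problems that draw on a given kind of data source."""
    return [p.problem for p in problems if source in p.sources]


def needs_real_time(problems):
    """List the problems explicitly flagged as needing real-time analysis."""
    return [p.problem for p in problems if p.real_time]


machine_problems = by_source(PROBLEMS, "machine")
streaming_problems = needs_real_time(PROBLEMS)
```

Grouping problems by characteristic like this is one way to see which ones could share an architecture.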

Big Data Architectures 

Analysis Type – Is the analysis done in real time or saved for later? The choice here decides the tools, hardware and data sources you need, to name a few, and it affects how you analyze the data.

Choosing a Processing Methodology – This is where you choose the techniques used to process the data. Your business requirements decide which technique to use: predictive or analytical models, ad-hoc queries, or report building. Choose wisely. Which does your business need most?

Data Frequency and Size – Knowing these two things will help you decide which storage, formats, and tools you need. Frequency and size also depend on the data types.

The types of data to be gathered and their content formats are key to choosing the tools and techniques you’ll use. The source of the data is important to consider too.

Data Consumers

Business processes/users 


People in different business roles 

Process Flows 

Different data repositories and/or applications 

Hardware – Make sure to choose the right hardware. Understanding the limitations of any hardware you decide to buy also helps you decide on the solution for the company.


A lot goes into choosing which big data architecture is best for your company. Don’t rush into it; take your time figuring this out, as it will cost money. Think about which area you need the most help with, then decide on the architecture you’ll use. In the long run it’ll help you save money and gain more customers.



Posted on

10 Privacy Problems of Big Data Analytics

privacy, data security, security, data

Source: Rebecca Herold


With all the data leaks, the hacking episodes, and the discovery that some data firms use people’s emotions to change their minds about elections, it’s no surprise that data privacy is such a concern. I know it’s a big concern of mine. I don’t want my information abused or sold to third parties to use as they will. Who wants that? Tonight, I decided to share 10 big privacy concerns we all have to look out for.

10 Privacy Problems

  1. Privacy Breaches and Embarrassments – We all know about the problem Facebook has now due to this. Instead, let’s look at how companies might use data captured on their sites about pregnant women who haven’t told their families. The family finds out through flyers in the mail, and the woman is embarrassed because she didn’t want to tell anyone yet. These situations happen a lot.
  2. The Possible Impossibility of Remaining Anonymous – With so much data being shared and such powerful analytics being used to decipher it, sharing information anonymously is almost impossible. Customers need a way to set rules on how their anonymous data is used.
  3. Masking Data Might Become Obsolete – Data masking needs to be done correctly. If not, it’s very possible that Big Data Analytics might just break open the mask set in place. You must set up policies, procedures, and processes effectively in order for the user to keep their masks.
  4. Unethical Use Based on Interpretation – This is the big one on everyone’s radar today. Attempts to influence behavior and decision-making are a huge threat at present.
  5. Big Data Analysis isn’t Always 100% Accurate – The data gathered isn’t always accurate, which means the results aren’t always going to be right. Flawed algorithms or incorrect data models can also be to blame. The more complex the algorithm or data set, the more chances there are for mistakes. People, in turn, can be denied services, be falsely accused of something or even be misdiagnosed with an illness that they don’t have.
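Problems 2 and 3 are easier to picture with a toy example. The sketch below masks customer names with a keyed hash and then shows how the masked rows can still be linked back to people through leftover details (zip code and birth year) that also appear in some other data set. Every name, record, and key here is hypothetical, and this is only an illustration of the risk, not a recommended masking scheme.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would manage its keys properly.
SECRET_KEY = b"hypothetical-secret"


def mask(value):
    """Replace a value with a keyed hash so the original is not stored."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


customers = [
    {"name": "Ann Lee", "zip": "30301", "birth_year": 1985, "purchase": "prenatal vitamins"},
    {"name": "Bob Ray", "zip": "30301", "birth_year": 1990, "purchase": "garden hose"},
]

# Mask only the name; zip and birth year are left in for analytics.
masked_rows = [{**row, "name": mask(row["name"])} for row in customers]

# Some other data set that happens to share the same leftover details.
public_directory = [
    {"name": "Ann Lee", "zip": "30301", "birth_year": 1985},
    {"name": "Bob Ray", "zip": "30301", "birth_year": 1990},
    {"name": "Cy Moss", "zip": "30302", "birth_year": 1985},
]


def reidentify(rows, directory):
    """Link masked rows to names when (zip, birth_year) matches one person."""
    by_details = {}
    for person in directory:
        key = (person["zip"], person["birth_year"])
        by_details.setdefault(key, []).append(person["name"])
    matches = {}
    for row in rows:
        names = by_details.get((row["zip"], row["birth_year"]), [])
        if len(names) == 1:
            matches[row["name"]] = names[0]
    return matches


# In this toy data, both masked rows link back to a real name.
reidentified = reidentify(masked_rows, public_directory)
```

Hiding the name wasn’t enough here, because the remaining fields were unique to each person. That’s the sense in which analytics can "break open the mask."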

If That Isn’t Enough…

  6. Discrimination – Analytics has to be totally objective, or people can be unfairly turned down for opportunities. This includes promotions, job offers, and loans.
  7. A Gap in Laws to Protect the People Involved – This one is especially relevant today. It always seems to take an incident before companies look at their business models to see how they can better protect their users. Most still only tell users about the privacy risks and think that’s enough, but it isn’t, not anymore.
  8. Big Data Will Most Likely Last Forever – There’s not much indication that any company will ever delete all the data it has gathered about customers over time. The insights are too valuable to give up, so the repositories just keep on growing.
  9. E-discovery Problems – Companies, for the most part, have to produce documents for litigation. Now that most documents are stored in repositories, analytics has to be used to search them, through what’s called predictive coding. This helps to find and review the documents needed in the litigation. The concern is that the coding might be faulty and miss data and documents needed for the proceedings.
  10. Patents and Copyrights Might Become Obsolete – The big concern here is that patent offices might have a hard time determining whether a submitted patent is unique. There’s so much data that it might just be too difficult to verify. This affects copyrights too, as it’s so hard to control information, which in turn affects royalties.


Don’t get me wrong. Big Data is great for business and can really help improve processes. Technology upgrades and inventions appear every day to keep up with all the new trends. Consider the 10 problems above if you want your business to be a leader in data privacy. New accountability policies and procedures need to be created to cover all these changes. Make sure to implement privacy and security controls before putting anything to use. Stay vigilant and don’t do what other companies have done in the past, either ignoring the problems or sweeping them under the rug. That’s my suggestion. Customers will like you a lot more for it.

Posted on

4 Best Practices for Big Data Privacy

big data, data, data privacy, privacy, security, data security, cloud, cloud storage

Source: TechTarget


Big data is becoming a more popular method of gathering data for business purposes. It isn’t just for storing data anymore: more companies are using it to draw useful information from business events. This can be anything from reviewing contracts to finding new ways to entice potential customers to your store. It also doesn’t work the old way, passing information from the company server to data storage. Instead, it uses virtualization architecture to draw from large content stores and archives, and the information it finds becomes a global resource. In turn, this allows for better forecasting and predictions that might actually work.

Sources of Privacy Concerns 

  • Quality and Accuracy of Data – How might inaccurate data negatively affect people in the decisions being made? How do unreliable Internet searches affect the data? Is it possible that the scientist looking up the information is using unverified information without realizing it?

Best Practices in Big Data Privacy 

  1. Developing High Competency – You need to become extremely proficient in finding, buying and managing cloud services, which are considered an integral part of big data for keeping costs down. Some companies prefer not to invest in their own systems and instead use cloud-based applications, infrastructure, and processing power. Either way, ensuring privacy takes constant monitoring and audits of the cloud services your company is using. Checking data integrity, confidentiality and availability is a must.
  2. Implementing Converged Storage – It’s much more efficient and reduces possible errors, which increases data quality and accuracy. It also reduces the duplicate data stored in the same locations and improves cost efficiency.
  3. Properly Sanitizing Data – Make sure to analyze, filter, join, and diagnose data at the earliest possible touch points. It’ll make work much easier, since you won’t have to go back and fix errors, and it saves you money in the long run.
  4. Encourage and Invite – Create a free, user-friendly process that lets consumers access, review and correct the information already collected on them. Make sure your privacy policies are easy to find. Most of all, make it easy for people to contact you with questions or concerns. Transparency and being easy to reach are key.
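For practice 3, here is a rough sketch of what sanitizing at the earliest touch point could look like: checking and de-duplicating records as they come in, before they’re stored. The record layout, the `sanitize` function, and the very simple email check are assumptions of mine, just for illustration.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def sanitize(records):
    """Normalize, validate and de-duplicate records at ingestion time.

    Returns (clean, rejected), where rejected keeps the reason for each
    record so problems can be diagnosed instead of silently dropped.
    """
    clean, rejected, seen = [], [], set()
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if not EMAIL_RE.match(email):
            rejected.append((rec, "invalid email"))
            continue
        if email in seen:
            rejected.append((rec, "duplicate"))
            continue
        seen.add(email)
        clean.append({**rec, "email": email})
    return clean, rejected


incoming = [
    {"email": " Ann@Example.com "},
    {"email": "ann@example.com"},
    {"email": "not-an-email"},
    {"email": "bob@example.com"},
]
clean, rejected = sanitize(incoming)
```

Keeping the rejected records with a reason attached covers the "diagnose" part: you can see what went wrong without going back to fix errors later.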


Asking for consent to gather information is not enough now. So much data is being gathered that consent alone isn’t really the question. More to the point is telling customers how they can restrict the use of their information or delete it. Not all companies offer this to their customers, which is exactly why you should try it. It will most likely become a requirement in the future. Enabling privacy through best practices is going to be your best bet. Most of all, it will increase the trust and transparency between you and your customers in the long run, while saving money at the same time.

Posted on

5 ways companies are using big data to help their customers (via VentureBeat)

big data, enterprise data, analytics, data analytics, data modeling, data science, data modeling, data model, data, data science, data scientist, data management,

Five ways companies are using big data to treat customers more like individuals — and build better long-term relationships so those customers happily buy more and more

Source: VentureBeat


As we all remember, back in the day you could go to the store and the clerk would know you personally. They would ask how you are and how your family is. It was a very personal relationship, and it created loyalty between you and the store. That was lost for a while when stores started to sell online. There were no programs to make your shopping experience more personal or enjoyable. You just went online to search and buy. Big data helps to build relationships again, as it can help companies offer better service to customers when it’s put to use. Here are five ways that big data helps online stores treat their customers more like people instead of just numbers.

5 Methods to Use Big Data

  1. Prediction – Big data can help analyze past behaviors of customers to build a more personalized experience for them. This in turn creates satisfaction for the person and increases purchases.
  2. Excitement – This is more for wearable technology. FitBit and other companies share the data they gather with their clients, which makes the client more interested and excited to see improvements. This happens in other industries too, not just the health industry. There are apps that help track finances and make people excited to invest more. Showing the data makes the client happier. It can also show them where they need to work to improve themselves. It’s a good tool for the customer to use.
  3. Improvement – Customer service is just as important as effective marketing and product development. Big data can help in all these areas too. Representatives can answer questions more quickly and effectively when the correct data is in front of them. This way the customer doesn’t feel like they are being badgered. The data also helps because customers now have many more ways to get hold of companies than before.
  4. Identify – Find the difficulties customers are having to improve their experience. It’ll make for happier and more loyal customers.
  5. Reduce – This deals with the health care industry, where big data is used to improve the quality of patient care. It helps to cut costs and improve treatments.


Big data helps companies now to understand their customers better.  This helps agencies give better services and build relationships again, in a more modern way.  Just consider all the possibilities.  I would think about switching over myself if I had a bigger company and could afford it.

Posted on

84% Of Enterprises See Big Data Analytics Changing Their Industries’ Competitive Landscapes In The Next Year (via Forbes)

big data, data science, big data analytics, analytics, data modeling, data management, smart data, data mining

87% of enterprises believe Big Data analytics will redefine the competitive landscape of their industries within the next three years. 89% believe that companies that do not adopt a Big Data analytics strategy in the next year risk losing market share and momentum. These and other key findings are from an […]

Source: Forbes

I just thought that I would share this article. It has some great statistics on why Big Data is now considered essential for any type of competitive growth. For example, only 13% use Big Data analytics in predictive modeling, while only 16% are using the information that they find to improve processes. If you were to use Big Data analytics, imagine what kind of growth your business could have…

I love studies as they always show the numbers to help strengthen their arguments.  Just wanted to share this with you all.

Posted on

Steps to create Data Model (via

Big Data, big data, data modeling, data, data science, data scientist, data management, analysis, data analyzing, technology, tech


Review of Article

This is general guidance for creating standard data models. I’m not going to include all the steps, as there are over 24 separate steps, and depending on what your business requires you might not need them all anyway. The link to the article is above if you’d like to see the entire list of what you can include.

Steps for Building Logical Data Models

  1. Gather up the business requirements
  2. Analyze business requirements
  3. Select target database – the data modeling tool will build the scripts to create reports
  4. Assign data types to the attributes created to find data
  5. When the analysis is complete, create columns to sort data
  6. Build subject areas to add the data
  7. Validate data model
  8. Create reports

Steps for Building Physical Data Models

  1. Get the logical data model and build a physical one from it
  2. Add properties to sort data
  3. Create SQL scripts
  4. Compare the database with the data model
  5. Create change log to document changes that have occurred
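Steps 3 and 4 of the physical model are the easiest to show in code. The sketch below generates `CREATE TABLE` scripts from a tiny logical model, runs them against an in-memory SQLite database, and then compares the database with the model. The model contents, table names, and helper functions are hypothetical. The article’s own steps assume a data modeling tool, which this does not try to imitate.

```python
import sqlite3

# Hypothetical logical model: entity -> {attribute: SQL type}.
LOGICAL_MODEL = {
    "customer": {"customer_id": "INTEGER PRIMARY KEY", "name": "TEXT", "email": "TEXT"},
    "purchase": {"purchase_id": "INTEGER PRIMARY KEY", "customer_id": "INTEGER", "amount": "REAL"},
}


def build_ddl(model):
    """Step 3: create SQL scripts from the model."""
    statements = []
    for table, columns in model.items():
        cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
        statements.append(f"CREATE TABLE {table} ({cols});")
    return statements


def compare(conn, model):
    """Step 4: compare the database with the model, table by table."""
    diffs = {}
    for table, columns in model.items():
        # PRAGMA table_info returns one row per column; index 1 is the name.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        actual = {row[1] for row in rows}
        expected = set(columns)
        if actual != expected:
            diffs[table] = {
                "missing": sorted(expected - actual),
                "extra": sorted(actual - expected),
            }
    return diffs


conn = sqlite3.connect(":memory:")
for statement in build_ddl(LOGICAL_MODEL):
    conn.execute(statement)

in_sync = compare(conn, LOGICAL_MODEL)

# Simulate a change made directly in the database, outside the model.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
drift = compare(conn, LOGICAL_MODEL)
```

Any difference that `compare` reports is the kind of thing that would go into the change log in step 5.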


Posted on

10 Key Big Data Trends That Drove 2017 (via Datanami)

hadoop, big data trends, big data, analytics, data modeling, data science, machine learning, ai, deep learning

2017 has come and (almost) gone. It was a memorable year, to be sure, with plenty of drama and unexpected happenings in terms of the technology, the players, and the application of big data and data science. As we gear up for 2018, we think it’s worth taking some time to ponder about what happened in 2017 and put…

Source: Datanami

10 Big Data Trends

  1. The re-emergence of AI, deep learning and machine learning
  2. Hadoop becomes less popular
  3. Graph databases grow in use
  4. Apache Spark is keeping up with the competition
  5. The Cloud is super popular for storing Big Data
  6. Big Data fabric bypasses integration problems
  7. Big Data swamps are becoming a problem with too much data saved
  8. Big Data company IPOs are becoming popular
  9. Data Science platforms and vendor choices are growing
  10. Look out for GDPR (General Data Protection Regulation) that goes into effect on May 25, 2018

Posted on

De-mystifying the Big Data Business Model Maturity Index – (via InFocus Blog | Dell EMC Services)

Big Data, ai, deep learning, machine learning, analytics, business modeling, data modeling, data science data scientists, W3C,

Bill Schmarzo illustrates each stage of the big data maturity journey, with the new Big Data Business Model Maturity Index (BDBMMI) infographic

Source: InFocus Blog | Dell EMC Services


This is such a helpful article as it goes through all the stages of Big Data maturity in your business.  There are five stages that companies go through to reach maturity.

5 Stages to Maturity

  1. Business Monitoring – Most companies get stuck here. They keep optimizing their Business Intelligence and think that is enough. Moving on to Big Data takes further steps, the biggest being the use of data analytics like data mining, machine learning, AI, and blockchain.
  2. Business Insights – Predictive analytics sorts through all the information gathered from transaction and operational data and from internal unstructured data like emails and customer comments. Publicly available data like social media, tax records and home values is gathered too, in case it’s beneficial for your company.
  3. Business Optimization – Prescriptive analytics helps to make recommendations for the business and for customers. This improves business performance and points the company in the right direction.
  4. Insight Monetization – This is where your company leverages the insights drawn from all the data it has gathered.
  5. Business Metamorphosis – Your company changes here to adapt to all the new insights gathered. In turn, this makes your company a lot more agile and adaptable to change, giving it much more of a competitive advantage.


By using data science, your company can become much more adaptable to change, which can help it grow. To take advantage, the company has to optimize key business processes, improve customer experiences, and create new revenue opportunities. Just make sure that the systems you have in place can handle that much data, as it becomes a problem if they can’t.

Posted on

What a Big-Data Business Model Looks Like (via Harvard Business Review)

big data, data, data science, data scientist, business, business intelligence, BI

There are three main ways to profit from the data revolution.

Source: Harvard Business Review


This is an interesting way of creating a business model by using big data. The article is about three models that are becoming more popular than others and how companies are going about them. One uses results to create offers that set you apart from others in your industry. Another brokers the information gathered. The third builds networks to deliver data where and when it’s needed.

The 3 Methods Broken Down

  1. Information Differentiation – Offer new services, improve customer satisfaction, give contextual relevance
  2. Information Brokering – Sell raw data, offer benchmarking, provide analysis and insight
  3. Information Delivery Networks – Support marketplaces, deal making, advertising


There are many ways to profit from using data as a business model. But you have to choose wisely which model you want your business to follow. Take your time to decide, as using big data can really help you get a head start on your competition.