Unsupervised Learning

In supervised learning, we are given data that tells us the correct answers. Using the tumor example again, data for supervised learning are either benign or malignant and looks like this.

supervised learning dataset for tumor

However, for unsupervised learning, we are given data that does not come with correct answers, so we don't know whether the tumor is benign or malignant, we just know about its features. It will look something like this.

unsupervised learning dataset for tumor

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables (another word for features).

We can derive this structure by clustering the data based on relationships among the variables in the data. In the above example, we can separate the data into two clusters.

clustering data

Google news utilizes clustering algorithm to group together stories written on the same topic. In social networks such as Facebook, clustering can be used to identify cohesive groups of friends based on your interaction with them. Large companies use clustering to group customers into market segments so they can better market to users based on their needs and preferences.

Another way of handling unsupervised learning is the "Cocktail Party Algorithm". Imagine you are at a cocktail party and there are two speakers speaking simultaneously, with two microphones recording both speakers at the same time. The Cocktail Party Algorithm will be able to separate out one speaker from the other. So you get a result of each speaker's individual speech.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

Quiz


Of the following examples, which would you address using an unsupervised learning algorithm? (Check all that apply.)

  • [ ] Given email labeled as spam/not spam, learn a spam filter.
  • [x] Given a set of news articles found on the web, group them into sets of articles about the same stories.
  • [x] Given a database of customer data, automatically discover market segments and group customers into different market segments.
  • [ ] Given a dataset of patients diagnosed as either having diabetes or not, learn to classify new patients as having diabetes or not.

results matching ""

    No results matching ""