smsociety15 has ended
Wednesday, July 29 • 13:31 - 15:00
"Predicting Users’ Demographic Attributes from Browsing Data Analysis"


Authors: Yu-Chin Liu, In-Ya Lee and Yi-Hsuan Chiang

As mobile devices become ubiquitous, people all over the world spend more and more time browsing the internet. However, because internet users are largely anonymous, businesses find it hard to obtain the real demographic data of potential customers. For many companies, customers’ demographic data are essential for creating product segmentations and new business lines.
Identifying customer segments has become crucial to making corporations more competitive and profitable. From the perspective of e-commerce strategy, delivering the right products to the right customers is without doubt the first priority, especially in a rapidly changing business world. Traditionally, targeting customers based on demographic data has been the most common approach; however, the anonymity of internet browsing makes this difficult, since e-commerce is not conducted face-to-face. Fortunately, the digital world provides excellent platforms for collecting users’ browsing behavior, which was hard to do before. Such collected data bring not only big data but also great business opportunities.

One of the best-known applications of big data is analyzing business data to gain insight into customers; however, when every company sees the same analytics or applies the same data mining techniques, no extra benefit is gained. New data analysis methods and tools are therefore needed to build corporate competitive advantage. In this work, we devise methods for discovering customers’ demographic data to help practitioners make better product segmentations.

Objective: 
Our main goal is to build classification models that predict the demographic attributes of anonymous browsers. As stated above, such information is essential for market segmentation. Ultimately, we expect the research results to help business people seize business opportunities of their own.

Methods: 
The data come from 1582 panel members whose browsing behaviour is logged with their consent. The panelists are required to provide accurate demographic data and then to download and install the NetRover™ software, which records all of their online browsing for one month. The collected data are then used to evaluate our proposed models.
Three classification models are proposed to predict the demographic attributes. The first is built on the websites visited. However, since treating every single website as an input would give the model too many inputs, the website categories defined by comScore are used as the input attributes instead. Each comScore category is further divided into three measures: width, length and depth.
The second model uses the frequency counts of every comScore category, together with browsing style characteristics defined in our work, to predict the demographic class. The browsing style characteristics consist of the average number of categories visited per session, the average number of web pages visited per category, the visit counts of the overall top 5 categories and the richness of all categories.
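
The browsing style characteristics listed for the second model could be computed roughly as in the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the input layout (one `(session_id, category)` pair per page view), the function name `browsing_style_features`, and the reading of "richness" as the number of distinct categories visited are all hypothetical, since the abstract does not define them.

```python
from collections import Counter, defaultdict


def browsing_style_features(page_views, top_categories):
    """Compute browsing style features for one panelist.

    page_views: list of (session_id, category) pairs, one per page view
                (assumed layout, not specified in the abstract).
    top_categories: the panel-wide most visited categories, e.g. the top 5.
    """
    if not page_views:
        raise ValueError("page_views must contain at least one page view")

    # Page views per category for this panelist.
    category_counts = Counter(category for _, category in page_views)

    # Distinct categories seen in each session.
    categories_by_session = defaultdict(set)
    for session_id, category in page_views:
        categories_by_session[session_id].add(category)

    features = {
        "avg_categories_per_session": sum(
            len(cats) for cats in categories_by_session.values()
        ) / len(categories_by_session),
        "avg_pages_per_category": len(page_views) / len(category_counts),
        # Assumption: richness = number of distinct categories visited.
        "richness": len(category_counts),
    }

    # Visit counts of the panel-wide top categories (0 if never visited).
    for category in top_categories:
        features[f"visits_{category}"] = category_counts.get(category, 0)
    return features
```

The resulting dictionary can be combined with the per-category frequency counts to form one input row per panelist for whichever classifier is chosen.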

Last but not least, the third model considers how category visiting sequences differ between demographic attribute values and uses the frequent sequences as input attributes to the classification model.
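
One possible way to turn frequent category sequences into classifier inputs is sketched below in Python. The abstract does not say which sequence mining method is used, so this sketch substitutes a simple stand-in: contiguous category n-grams, kept when they occur in at least a `min_support` fraction of the panelists within one demographic group. The function names, the parameters `n` and `min_support`, and the binary encoding are assumptions made for illustration.

```python
from collections import Counter


def ngrams(sequence, n):
    """Return the set of contiguous length-n subsequences of a category sequence."""
    return {tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)}


def frequent_sequences(sequences, n=2, min_support=0.5):
    """Return n-grams present in at least min_support (a fraction) of the sequences."""
    counts = Counter()
    for sequence in sequences:
        # Each n-gram is counted once per panelist, however often it repeats.
        counts.update(ngrams(sequence, n))
    threshold = min_support * len(sequences)
    return {gram for gram, count in counts.items() if count >= threshold}


def sequence_features(sequences, labels, n=2, min_support=0.5):
    """Mine frequent n-grams per demographic group and encode each panelist.

    Returns (patterns, matrix), where matrix[i][j] is 1 if panelist i's
    sequence contains patterns[j], else 0.
    """
    patterns = set()
    for label in set(labels):
        group = [seq for seq, lab in zip(sequences, labels) if lab == label]
        patterns |= frequent_sequences(group, n, min_support)

    patterns = sorted(patterns)  # fixed column order for the feature matrix
    matrix = [
        [int(pattern in ngrams(sequence, n)) for pattern in patterns]
        for sequence in sequences
    ]
    return patterns, matrix
```

Mining the patterns separately for each attribute value keeps sequences that are common in one group even if they are rare across the whole panel, which matches the stated idea of comparing sequences between demographic attribute values.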

Results: 
So far, we are still working on predicting the “gender” label. The experimental results show that the style characteristics model (2nd model) outperforms the categories model (1st model), which in turn outperforms the sequences model (3rd model).

Future Work: 
With the classification models for predicting anonymous browsers’ demographic attributes in place, we would like to explore new dimensions for segmenting and clustering customers based on their web-surfing behavior. Further, we are interested in using the resulting segments and customer clusters to find ways of placing high-ROI internet (mobile) advertisements.


Wednesday July 29, 2015 13:31 - 15:00 EDT
(9th Floor) TRS 3-176 (Ted Rogers School of Management) 55 Dundas Street West, Toronto, ON M5G 2C3
