
01/08/2018 HYU News > Academics > Researcher of the Month


[Excellent R&D] Big Data and the Key to Handling Them

Kim Sang-wook (Department of Computer Science)

Jeon Chae-yun


As social networking becomes ever more inseparable from daily life, the number of users keeps growing. As a consequence, the volume of big data in this area is expanding, and with it the need to analyze and organize that data efficiently. In his Big Data Science Laboratory, Kim Sang-wook (Department of Computer Science) has been researching this topic continuously. In his recent paper “High-performance graph data processing on a single machine,” Kim proposes methods to increase the performance of data processing and to arrange massive amounts of data efficiently.

A graph, or network, is an arrangement of nodes and edges, which represent the components of an online world, such as users and webpages, and the relationships among them, respectively. In a social network, for example, each user is labeled as a node, and the relationships that users have with other users or webpages are marked as edges. “Where could this graph be used? Numerous types of data could be modeled in the form of this graph. For example, Facebook users and their friends, bloggers and their neighbors, and the recommender system of search engines such as YouTube, Amazon and more are all related to the graph of nodes and edges.” Depending on who views what and how many times, or which page receives the most views, weights can be added to the edge between a user and a page. Zooming out reveals a complex web of nodes and edges: the graph.
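The weighted graph described above can be illustrated with a minimal Python sketch. The user names, page names, and view log here are hypothetical, and the nested-dictionary layout is only one simple way to hold such a graph, not the representation used in Kim's paper.

```python
from collections import defaultdict

# Hypothetical view log: one (user, page) pair per page view.
views = [
    ("alice", "page_a"),
    ("alice", "page_a"),
    ("alice", "page_b"),
    ("bob", "page_a"),
]

def build_weighted_graph(view_log):
    """Return {user: {page: weight}}, where the weight counts the views."""
    graph = defaultdict(lambda: defaultdict(int))
    for user, page in view_log:
        graph[user][page] += 1  # each repeated view makes the edge heavier
    return {user: dict(pages) for user, pages in graph.items()}

graph = build_weighted_graph(views)
```

Here each user and page is a node, and the count stored for a user-page pair is the weight of the edge between them.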
 
Big data is usually processed in matrix form, a process that Kim has made more efficient.
(Photo courtesy of Kim)

Kim made graph data processing more efficient through three approaches. First, he made matrix multiplication faster by balancing the load across the thread blocks that process the matrix. When the load is poorly balanced across the rows of the matrix, the multiplication can take a long time and performance suffers. With balanced thread blocks, the workload is distributed evenly, which makes the process much less time-consuming than the previous method. Second, Kim created a graph engine, a piece of storage software that handles data efficiently. In order to analyze a graph, the data must first be saved on a disk. The graph engine that Kim proposed in his paper is the tool that helps store the data on disk more efficiently.
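To give a rough sense of why load balancing matters, the following Python sketch compares two ways of dividing matrix rows among workers when the rows contain very different numbers of non-zero entries. It is an illustration only: the greedy strategy, the function names, and the sample numbers are assumptions made for this example, not the method from Kim's paper.

```python
import heapq

def naive_split(row_nnz, workers):
    """Give each worker the same number of consecutive rows."""
    loads = [0] * workers
    chunk = -(-len(row_nnz) // workers)  # ceiling division
    for i, nnz in enumerate(row_nnz):
        loads[i // chunk] += nnz
    return loads

def balanced_split(row_nnz, workers):
    """Greedy: hand the heaviest remaining row to the least-loaded worker."""
    heap = [(0, w) for w in range(workers)]  # (current load, worker id)
    loads = [0] * workers
    for nnz in sorted(row_nnz, reverse=True):
        load, w = heapq.heappop(heap)
        loads[w] = load + nnz
        heapq.heappush(heap, (loads[w], w))
    return loads

# Hypothetical non-zero counts per row: two heavy rows, six light ones.
row_nnz = [100, 90, 5, 5, 4, 3, 2, 1]
naive_loads = naive_split(row_nnz, 2)
balanced_loads = balanced_split(row_nnz, 2)
```

With the naive split, one worker receives both heavy rows and the other sits mostly idle, so the total time is set by the overloaded worker. The greedy split keeps the two loads equal in this example.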

“The strength of our laboratory is that we research on two aspects of data. By researching the performance-wise aspect of the data and also the analytical aspect, we leave no chance of missing a single detail of matter.” Third, Kim introduced a placement algorithm that improves the arrangement of nodes in a graph engine. Previously, when a graph was analyzed in a graph engine, the data was stored in the exact order in which it arrived. Clusters of unrelated nodes could delay data processing. Kim solved this by discovering that grouping nodes with similar traits together makes a big difference in the overall performance of graph processing. With the same data, different results can be obtained depending on how advantageously the nodes are grouped.
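The effect of node placement can be illustrated with a small Python sketch. It uses a plain breadth-first ordering as a stand-in for a placement algorithm, which is an assumption made for this example and not the algorithm from Kim's paper. The node names and the `edge_span` measure are likewise hypothetical.

```python
from collections import deque

def bfs_order(adj, arrival_order):
    """Visit nodes breadth-first so that neighbors end up close together."""
    seen, order = set(), []
    for start in arrival_order:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return order

def edge_span(adj, order):
    """Sum of storage-position gaps over all edges (lower means better locality)."""
    pos = {node: i for i, node in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u in adj for v in adj[u] if pos[u] < pos[v])

# Two hypothetical friend groups whose members arrived interleaved.
adj = {
    "a1": ["a2", "a3"], "a2": ["a1", "a3"], "a3": ["a1", "a2"],
    "b1": ["b2", "b3"], "b2": ["b1", "b3"], "b3": ["b1", "b2"],
}
arrival = ["a1", "b1", "a2", "b2", "a3", "b3"]
reordered = bfs_order(adj, arrival)
```

In arrival order, connected nodes are scattered across storage. After reordering, each group sits in one contiguous run, so the total gap between connected nodes shrinks even though the data itself is unchanged.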

Kim plans to use his current research on the graph engine and graph modeling as stepping stones to his next research, which is directed toward community detection and recommender systems. With data modeled as a graph, analysis becomes easier, and the members of a social community with similar interests can be conveniently detected. Similarly, a more accurate recommender system could be developed by using the graph to analyze what a user likes, clicks, views, buys, or prefers. Kim will continue to build on these building blocks as he carries on his future research.
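As a simple illustration of how a recommender might use such a graph, the Python sketch below suggests pages to a user based on what users with overlapping views have looked at. The scoring rule, names, and sample data are assumptions made for this example and do not describe Kim's planned research.

```python
from collections import Counter

def recommend(graph, user, top_n=2):
    """Score unseen pages by the view weights of users who share a page with `user`."""
    seen = graph[user]
    scores = Counter()
    for other, pages in graph.items():
        if other == user:
            continue
        # Overlap: how strongly the two users' viewing histories coincide.
        overlap = sum(min(w, pages[p]) for p, w in seen.items() if p in pages)
        if overlap == 0:
            continue
        for page, weight in pages.items():
            if page not in seen:
                scores[page] += overlap * weight
    return [page for page, _ in scores.most_common(top_n)]

# Hypothetical weighted graph: {user: {page: number of views}}.
graph = {
    "alice": {"page_a": 2, "page_b": 1},
    "bob": {"page_a": 1, "page_c": 3},
    "carol": {"page_d": 5},
}
suggestions = recommend(graph, "alice")
```

Because alice and bob both viewed `page_a`, bob's other page is suggested to alice, while carol, who shares nothing with alice, contributes no suggestions.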
 
"Characteristics of the data could be figured out by analyzing the graphs."



Jeon Chae-yun
        chaeyun@hanyang.ac.kr
Photos by Kang Cho-hyun
 