LinkedIn Knowledge Graph – KDnuggets Interview

We interview LinkedIn about its recently published LinkedIn Knowledge Graph, which connects its many millions of members, jobs, companies, and more.

LinkedIn recently published The LinkedIn Knowledge Graph (LKG). It is an impressive achievement, connecting 450M members, 190M historical job listings, 9M companies, 200+ countries, 35K skills in 19 languages, 28K schools, 1.5K fields of study, 600+ degrees, 24K titles in 19 languages, and 500+ certificates, among other entities, as of Oct 6, 2016.

I had an opportunity to ask LinkedIn a few questions, and here are the answers from Bee-Chung Chen, Senior Staff Engineer & Applied Researcher at LinkedIn, and Deepak Agarwal, VP of Engineering, Head of Relevance at LinkedIn, two of the leaders of the LKG project.

"Data Scientist" is the canonical form of a title entity in the taxonomy. A member or a job with title string "Data Mining Scientist" is standardized to title "Data Scientist" by our title standardizer (a supervised binary classifier) based on title string features and other member/job metadata (e.g., the skills of the member or the skills required by the job).

However, not all similar title strings can be mapped to the same entity by this supervised method, e.g., "Predictive Analytics Specialist" is not standardized to "Data Scientist", partially because collecting high-quality and high-volume training data for this task is challenging.
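To make the standardization step concrete, here is a minimal sketch of a pairwise match scorer in Python. It is not LinkedIn's standardizer: the two features (token overlap between the title string and the entity name, and overlap between skill sets), the weights, the bias, and the 0.5 threshold are all hypothetical stand-ins for what a supervised model would learn from labeled (title string, title entity) pairs.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def features(title_string, skills, entity_name, entity_skills):
    """Two illustrative features: title-token overlap and skill overlap."""
    title_overlap = jaccard(
        set(title_string.lower().split()), set(entity_name.lower().split())
    )
    skill_overlap = jaccard(
        {s.lower() for s in skills}, {s.lower() for s in entity_skills}
    )
    return [title_overlap, skill_overlap]


# Hypothetical parameters; a real system would learn these from training data.
WEIGHTS = [4.0, 3.0]
BIAS = -3.0


def match_probability(title_string, skills, entity_name, entity_skills):
    """Logistic score for 'this title string maps to this title entity'."""
    x = features(title_string, skills, entity_name, entity_skills)
    z = BIAS + sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))


def is_match(title_string, skills, entity_name, entity_skills, threshold=0.5):
    """Binary decision: standardize the string to the entity or not."""
    p = match_probability(title_string, skills, entity_name, entity_skills)
    return p >= threshold


# Made-up skill lists for illustration only.
DATA_SCIENTIST_SKILLS = ["machine learning", "statistics", "python", "sql"]

print(is_match("Data Mining Scientist",
               ["machine learning", "python", "statistics"],
               "Data Scientist", DATA_SCIENTIST_SKILLS))
print(is_match("Predictive Analytics Specialist",
               ["statistics", "forecasting"],
               "Data Scientist", DATA_SCIENTIST_SKILLS))
```

With these made-up inputs, the first string clears the threshold and the second does not, which mirrors the limitation described above: a string that shares no tokens and few skills with the canonical title gets no entity-level match.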

To augment the binary decision in such an entity-level standardization task, we also provide the similarity among these three title strings in the following two ways simultaneously. First, the LinkedIn title taxonomy has a hierarchical structure: title → super title → function, which enables a higher-level similarity. For example, these three title strings can all belong to the same super title and/or the same function.

Downstream data mining applications can select the most suitable title granularity level.
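The title → super title → function hierarchy can be illustrated with a small Python sketch. The taxonomy entries below are invented for illustration and are not LinkedIn's actual data; the function names are likewise hypothetical. The idea is to find the finest level at which two titles coincide, so a downstream application can pick the granularity it needs.

```python
from typing import Optional

# Hypothetical toy taxonomy: title -> (super title, function).
# Illustrative values only, not taken from the LinkedIn taxonomy.
TAXONOMY = {
    "Data Scientist": ("Data Science", "Analytics"),
    "Predictive Analytics Specialist": ("Data Science", "Analytics"),
    "Business Analyst": ("Business Analysis", "Analytics"),
    "Software Engineer": ("Software Development", "Engineering"),
}

# Levels ordered from finest to coarsest.
LEVELS = ("title", "super_title", "function")


def ancestors(title: str) -> dict:
    """Return the title together with its super title and function."""
    super_title, function = TAXONOMY[title]
    return {"title": title, "super_title": super_title, "function": function}


def finest_shared_level(a: str, b: str) -> Optional[str]:
    """Finest taxonomy level at which two titles agree, or None."""
    path_a, path_b = ancestors(a), ancestors(b)
    for level in LEVELS:
        if path_a[level] == path_b[level]:
            return level
    return None


def roll_up(title: str, level: str) -> str:
    """Map a title to its value at the chosen granularity level."""
    return ancestors(title)[level]


print(finest_shared_level("Data Scientist", "Predictive Analytics Specialist"))
print(finest_shared_level("Data Scientist", "Business Analyst"))
print(roll_up("Business Analyst", "function"))
```

An application that needs coarse grouping would call `roll_up(title, "function")`, while one that needs tighter groups would roll up to `"super_title"` instead.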


