A comprehensive survey on automatic knowledge graph construction
Automatic knowledge graph construction aims at manufacturing structured human
knowledge. To this end, much effort has historically been spent extracting informative fact …
A survey of deep active learning
Active learning (AL) attempts to maximize a model's performance gain while annotating the
fewest samples possible. Deep learning (DL) is greedy for data and requires a large amount …
Can foundation models wrangle your data?
Foundation Models (FMs) are models trained on large corpora of data that, at very large
scale, can generalize to new tasks without any task-specific finetuning. As these models …
Deep entity matching with pre-trained language models
We present Ditto, a novel entity matching system based on pre-trained Transformer-based
language models. We fine-tune and cast EM as a sequence-pair classification problem to …
Neo: A learned query optimizer
Query optimization is one of the most challenging problems in database systems. Despite
the progress made over the past decades, query optimizers remain extremely complex …
Data-driven materials research enabled by natural language processing and information extraction
Given the emergence of data science and machine learning throughout all aspects of
society, but particularly in the scientific domain, there is increased importance placed on …
Table-gpt: Table-tuned gpt for diverse table tasks
Language models, such as GPT-3.5 and ChatGPT, demonstrate remarkable abilities to
follow diverse human instructions and perform a wide range of tasks. However, when …
A benchmarking study of embedding-based entity alignment for knowledge graphs
Entity alignment seeks to find entities in different knowledge graphs (KGs) that refer to the
same real-world object. Recent advancement in KG embedding impels the advent of …
[BOOK][B] Data cleaning
This is an overview of the end-to-end data cleaning process. Data quality is one of the most
important problems in data management, since dirty data often leads to inaccurate data …
Table-gpt: Table fine-tuned gpt for diverse table tasks
Language models, such as GPT-3 and ChatGPT, demonstrate remarkable abilities to follow
diverse human instructions and perform a wide range of tasks, using instruction fine-tuning …