Concave penalized estimation of sparse Gaussian Bayesian networks

B Aragam, Q Zhou - The Journal of Machine Learning Research, 2015 - jmlr.org
We develop a penalized likelihood estimation framework to learn the structure of Gaussian
Bayesian networks from observational data. In contrast to recent methods which accelerate …

Schema profiling of document-oriented databases

E Gallinucci, M Golfarelli, S Rizzi - Information Systems, 2018 - Elsevier
In document-oriented databases, schema is a soft concept and the documents in a collection
can be stored using different local schemata. This gives designers and implementers …

Curated databases

P Buneman, J Cheney, WC Tan… - Proceedings of the twenty …, 2008 - dl.acm.org
Curated databases are databases that are populated and updated with a great deal of
human effort. Most reference works that one traditionally found on the reference shelves of …

Inference of concise regular expressions and DTDs

GJ Bex, F Neven, T Schwentick… - ACM Transactions on …, 2010 - dl.acm.org
We consider the problem of inferring a concise Document Type Definition (DTD) for a given
set of XML-documents, a problem that basically reduces to learning concise regular …

A universal approach for multi-model schema inference

P Koupil, S Hricko, I Holubová - Journal of Big Data, 2022 - Springer
The variety feature of Big Data, represented by multi-model data, has brought a new
dimension of complexity to all aspects of data management. The need to process a set of …

Learning join queries from user examples

A Bonifati, R Ciucanu, S Staworko - ACM Transactions on Database …, 2016 - dl.acm.org
We investigate the problem of learning join queries from user examples. The user is
presented with a set of candidate tuples and is asked to label them as positive or negative …

SemMT: a semantic-based testing approach for machine translation systems

J Cao, M Li, Y Li, M Wen, SC Cheung… - ACM Transactions on …, 2022 - dl.acm.org
Machine translation has wide applications in daily life. In mission-critical applications such
as translating official documents, incorrect translation can have unpleasant or sometimes …

Extracting structured information from Wikipedia articles to populate infoboxes

D Lange, C Böhm, F Naumann - Proceedings of the 19th ACM …, 2010 - dl.acm.org
Roughly every third Wikipedia article contains an infobox, a table that displays important
facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes …

InfeRE: Step-by-Step Regex Generation via Chain of Inference

S Zhang, X Gu, Y Chen, B Shen - 2023 38th IEEE/ACM …, 2023 - ieeexplore.ieee.org
Automatically generating regular expressions (abbrev. regexes) from natural language
description (NL2RE) has been an emerging research area. Prior studies treat regex as a …

Making data platforms smarter with MOSES

M Francia, E Gallinucci, M Golfarelli, AG Leoni… - Future Generation …, 2021 - Elsevier
The rise of data platforms has enabled the collection and processing of huge volumes of
data, but has opened to the risk of losing their control. Collecting proper metadata about raw …