Research directions for principles of data management (Dagstuhl Perspectives Workshop 16151)

S Abiteboul, M Arenas, P Barceló, M Bienvenu… - 2018 - drops.dagstuhl.de
The area of Principles of Data Management (PDM) has made crucial contributions to the
development of formal frameworks for understanding and managing data and knowledge …

Schema extraction and structural outlier detection for JSON-based NoSQL data stores

M Klettke, U Störl, S Scherzinger - 2015 - dl.gi.de
Abstract: Although most NoSQL Data Stores are schema-less, information on the
structural properties of the persisted data is nevertheless essential during application …

Ensuring the correctness of regular expressions: A review

LX Zheng, S Ma, ZX Chen, XY Luo - International Journal of Automation …, 2021 - Springer
Regular expressions are widely used within and even outside of computer science due to
their expressiveness and flexibility. However, regular expressions have a quite compact and …

[BOOK][B] Web data management

S Abiteboul, I Manolescu, P Rigaux, MC Rousset… - 2011 - books.google.com
The Internet and World Wide Web have revolutionized access to information. Users now
store information across multiple platforms from personal computers to smartphones and …

Learning deterministic regular expressions for the inference of schemas from XML data

GJ Bex, W Gelade, F Neven… - ACM Transactions on the …, 2010 - dl.acm.org
Inferring an appropriate DTD or XML Schema Definition (XSD) for a given collection of XML
documents essentially reduces to learning deterministic regular expressions from sets of …

A universal approach for multi-model schema inference

P Koupil, S Hricko, I Holubová - Journal of Big Data, 2022 - Springer
The variety feature of Big Data, represented by multi-model data, has brought a new
dimension of complexity to all aspects of data management. The need to process a set of …

Complexity and Expressiveness of ShEx for RDF

S Staworko, I Boneva, JEL Gayo, S Hym… - … on Database Theory …, 2015 - research.ed.ac.uk
We study the expressiveness and complexity of Shape Expression Schema (ShEx), a novel
schema formalism for RDF currently under development by W3C. A ShEx assigns types to …

Enabling information extraction by inference of regular expressions from sample entities

F Brauer, R Rieger, A Mocan… - Proceedings of the 20th …, 2011 - dl.acm.org
Regular expressions are the dominant technique to extract business relevant entities (e.g.,
invoice numbers or product names) from text data (e.g., invoices), since these entity types …

SemMT: a semantic-based testing approach for machine translation systems

J Cao, M Li, Y Li, M Wen, SC Cheung… - ACM Transactions on …, 2022 - dl.acm.org
Machine translation has wide applications in daily life. In mission-critical applications such
as translating official documents, incorrect translation can have unpleasant or sometimes …

InfeRE: Step-by-Step Regex Generation via Chain of Inference

S Zhang, X Gu, Y Chen, B Shen - 2023 38th IEEE/ACM …, 2023 - ieeexplore.ieee.org
Automatically generating regular expressions (abbrev. regexes) from natural language
description (NL2RE) has been an emerging research area. Prior studies treat regex as a …