[BOOK] Compression and coding algorithms

A Moffat, A Turpin - 2002 - books.google.com
An authoritative reference to the whole area of source coding algorithms, Compression and
Coding Algorithms will be a primary resource for both researchers and software engineers …

Transformations for the compression of FASTQ quality scores of next-generation sequencing data

R Wan, VN Anh, K Asai - Bioinformatics, 2012 - academic.oup.com
Motivation: The growth of next-generation sequencing means that more effective and
efficient archiving methods are needed to store the generated data for public dissemination …

Matching demand and offer in on-line provision: A longitudinal study of monster.com

A Capiluppi, A Baravalle - 2010 12th IEEE International …, 2010 - ieeexplore.ieee.org
When considering the jobs market, changes or recurring trends for skilled employees
expressed by employers' needs have a tremendous impact on the evolution of website …

[PDF] Semantic lossy compression of XML data

VPB ISI-CNR - 2001 - academia.edu
In the last years a large amount of semistructured data [1, 10] has been managed and
exchanged. The largest repository of semistructured data is the World Wide Web, which can …

CROSSWORD: A Semantic Approach To Text Compression Via Masking

M Li, R **, L **ang, K Shen… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Conventional data compression methods typically model the information source as an iid
stochastic process, thereby establishing the fundamental limit as entropy for lossless …

On the application of wavelet transform and Huffman algorithm to Yorùbá language syntax text files compression

KA Amusa, A Adewusi, TC Erinosho, SA Salawu… - SJEE, 2022 - 91.187.132.54
Most algorithms of data compression were developed with English language as target text
syntax. However, this paper approaches the problem of Yorùbá text files compression via …

Patterns count-based labels for datasets

Y Moskovitch, HV Jagadish - 2021 IEEE 37th International …, 2021 - ieeexplore.ieee.org
Counts of attribute-value combinations are central to the profiling of a data set, particularly in
determining fitness for use and in eliminating bias and unfairness. While counts of individual …

Semi-lossless text compression

Y Kaufman, ST Klein - … Journal of Foundations of Computer Science, 2005 - World Scientific
A new notion, that of semi-lossless text compression, is introduced, and its applicability in
various settings is investigated. First results suggest that it might be hard to exploit the …

[PDF] Preprocessing for PPM: compressing UTF-8 encoded natural language text

WJ Teahan, KM Alhawiti - International Journal of Computer Science …, 2015 - academia.edu
In this paper, several new universal preprocessing techniques are described to improve
Prediction by Partial Matching (PPM) compression of UTF-8 encoded natural language text …

A framework for abstracting data sources having heterogeneous representation formats

D Rosaci, G Terracina, D Ursino - Data & Knowledge Engineering, 2004 - Elsevier
This paper deals with the issue of abstracting a data source characterized by one among
several possible representation formats. First we show that data source abstraction plays a …