Web content topic modeling using LDA and HTML tags

HHM Altarturi, M Saadoon, NB Anuar - PeerJ Computer Science, 2023 - peerj.com
An immense volume of digital documents exists online and offline with content that can offer
useful information and insights. Utilizing topic modeling enhances the analysis and …

Company2Vec—German Company Embeddings Based on Corporate Websites

C Gerling - International Journal of Information Technology & …, 2024 - World Scientific
With Company2Vec, the paper proposes a novel application in representation learning. The
model analyzes business activities from unstructured company website data using …

Building a sample frame of SMEs using patent, search engine, and website data

SK Arora, S Kelley, S Madhavan - Journal of Official Statistics, 2021 - journals.sagepub.com
This research outlines the process of building a sample frame of US SMEs. The method
starts with a list of patenting organizations and defines the boundaries of the population and …

An efficient graph‐based peer selection method for financial statements

S Noels, S De Ridder, S Viaene… - Intelligent Systems in …, 2023 - Wiley Online Library
Comparing companies can be useful for various purposes. Despite the widespread use of
industry classification systems as a peer selection standard, these have been criticized for …

Peer firm identification using word embeddings

T Kee - 2019 IEEE International Conference on Big Data (Big …, 2019 - ieeexplore.ieee.org
In the task of peer firm identification, researchers have relied on existing industry
classification system regardless of their critical limitations. In the existing industry …

Exploring a knowledge-based approach to predicting NACE codes of enterprises based on web page texts

H Kühnemann, A van Delden… - Statistical Journal of the …, 2020 - content.iospress.com
Classification of enterprises by main economic activity according to NACE codes is a
challenging but important task for national statistical institutes. Since manual editing is time …

Constructing economic taxonomy reflecting firm relationships based on news reports

Z Zhou, X Mu, X Lin - Data Technologies and Applications, 2022 - emerald.com
Purpose: This paper aims to propose a novel approach to constructing an economic
taxonomy that demonstrates the complex relationships between firms, which are not fully …

Cyber Parental Control Framework for Objectionable Web Content Classification and Filtering Based on Topic Modelling Using Enhanced Latent Dirichlet Allocation

HHM Altarturi - 2023 - search.proquest.com
The escalating concern revolves around cybersecurity for children, given the unprecedented
internet access that potentially exposes them to objectionable content. Recent data highlight …

Surveying the .NZ Top Level Domain: Business Sector Categorisation

T Borsje - Wellington Faculty of Engineering Symposium, 2023 - ojs.victoria.ac.nz
There is a vast amount of information present across “.NZ” domains, but no publicly
accessible or viable tools or resources exist to categorise them. All currently existing …