46
Last view: 2024-10-22
3
Last update: 2020-02-19
8
Last download: 2025-01-20
Monolingual Bulgarian corpus in the culture domain
Monolingual Bulgarian corpus in the culture domain, containing 3,720,190 tokens and 376,158 lexical types.
DSI Relevance:
Europeana
Distribution
Availability:
Available
Licences
Open Under-PSI
Used for resources that fall under the scope of PSI (Public Sector Information) regulations, and for which no further information is required or available. For more information on the EU legislation on the reuse of Public Sector Information, see here: https://ec.europa.eu/digital-single-market/en/european-legislation-reuse-public-sector-information.
Distribution Details
IPR Holders
Ministry of Culture, Republic of Bulgaria
http://mc.government...
Varna Municipality
http://www.varnacult...
Bulgaria (BG)
Municipal Foundation Plovdiv 2019
https://plovdiv2019.eu
Bulgaria (BG)
Sofia Municipal Council
http://www.sofia-da....
Bulgaria (BG)
Contact Person
Prokopis Prokopidis
http://nlp.ilsp.gr/~...
Institute for Language and Speech Processing / Athena Research Center
ILSP / ATHENA R.C.
Artemidos 6 & Epidavrou
GR-151 25 Maroussi
Greece
Tel.: +30 2106875432
http://www.ilsp.gr/, http://www.athenarc.gr
text
Monolingual text corpus
Languages
Bulgarian (bg)
Language Script:
Cyrillic
Linguality
Linguality type:
Monolingual
Text Format
XML
Size
376,158 Lexical Types
3,720,190 Tokens
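The two size figures are related: tokens are running words, while lexical types are distinct word forms. The following is a minimal sketch of how such counts could be obtained, assuming lowercased whitespace tokenization; the record does not state which tokenizer was actually used, and `count_tokens_and_types` is a hypothetical helper for illustration only.

```python
from __future__ import annotations


def count_tokens_and_types(text: str) -> tuple[int, int]:
    """Return (token count, lexical type count) for a text.

    Assumption: tokens are whitespace-separated and case-folded.
    The real corpus figures may come from a different tokenizer.
    """
    tokens = text.lower().split()
    return len(tokens), len(set(tokens))


# Figures reported in this record.
REPORTED_TOKENS = 3_720_190
REPORTED_TYPES = 376_158

# Type-token ratio: distinct word forms per running word.
type_token_ratio = REPORTED_TYPES / REPORTED_TOKENS

if __name__ == "__main__":
    print(count_tokens_and_types("култура и изкуство и Култура"))
    print(round(type_token_ratio, 4))
```

With the reported figures, the type-token ratio is roughly 0.10, that is, about one distinct word form for every ten running words.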
Character encoding
UTF-8
Domains
SOCIAL QUESTIONS
Culture And Religion (Eurovoc 2831)
EUROVOC
Creation
Creation mode details:
The ILSP Focused Crawler was used to acquire monolingual data from websites and to normalize, clean, and (near-)deduplicate the data at document level.
Creation mode:
Automatic
Creation Tools
http://nlp.ilsp.gr/r...
Resource Creation
Created using ELRC Services
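The creation mode details mention (near-)deduplication at document level. As an illustration of the general idea only, here is a minimal sketch that drops documents whose word-shingle sets overlap heavily with an already kept document. This is not the ILSP Focused Crawler's implementation: the shingle size, the Jaccard measure, the 0.8 threshold, and all function names are assumptions chosen for the example.

```python
from __future__ import annotations


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles of a lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts are represented by a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs: list[str], threshold: float = 0.8) -> list[str]:
    """Keep each document unless it is near-identical to one already kept."""
    kept: list[str] = []
    kept_shingles: list[set] = []
    for doc in docs:
        current = shingles(doc)
        if any(jaccard(current, seen) >= threshold for seen in kept_shingles):
            continue  # near-duplicate of an earlier document
        kept.append(doc)
        kept_shingles.append(current)
    return kept
```

This pairwise comparison is quadratic in the number of documents; for a web-scale crawl, a hashing scheme such as MinHash would typically replace the exhaustive comparison.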
Funding Project
Connecting Europe Facility-European Language Resource Coordination
(CEF-ELRC - LANGUAGE RESOURCE COORDINATION-SMART 2014/1074-30-CE-0696785/00-64)
URL:
http://www.lr-coordi...
Funding Type:
Service Contract
Funder:
European Commission
Funding Country:
European Union (EU)
Project duration:
29/03/2015 - 16/04/2017
Metadata
Created:
22/09/2016
Last Updated:
12/04/2017
Metadata Language:
English (en)
Metadata Creator
Kanella Pouli
Greece
Maria Giagkou
Institute for Language and Speech Processing / Athena Research Center
ILSP / ATHENA R.C.
Greece (GR)
http://www.ilsp.gr, http://www.athenarc.gr
Version
Version:
1.0
Relations
Related Resource:
Monolingual Bulgarian corpus in the culture domain (part 1) (Processed)
Relation Type:
Has Part
Related Resource:
Monolingual Bulgarian corpus in the culture domain (part 2) (Processed)
Relation Type:
Has Part