Monolingual Albanian corpus from websites of government of Albania (part 1)
A monolingual Albanian corpus compiled from websites of the public sector of Albania.
Distribution
Availability:
Available
Licences
Open Under-PSI
Used for resources that fall under the scope of PSI (Public Sector Information) regulations, and for which no further information is required or available. For more information on the EU legislation on the reuse of Public Sector Information, see here: https://ec.europa.eu/digital-single-market/en/european-legislation-reuse-public-sector-information.
Distribution Details
Contact Person
Prokopis Prokopidis
http://nlp.ilsp.gr/~...
Institute for Language and Speech Processing / Athena Research Center
ILSP / ATHENA R.C.
Artemidos 6 & Epidavrou
GR-151 25 Maroussi
Greece (GR)
Tel.: +30 2106875432
http://www.ilsp.gr/, http://www.athenarc.gr
text
Monolingual text corpus
Languages
Albanian (sq)
Language Script:
Latin
Linguality
Linguality type:
Monolingual
Text Format
Plain Text
Size
477,924 Segments
12,353,899 Tokens
Character encoding
UTF-8
Creation
Creation mode details:
Modules of the ILSP Focused Crawler were used for acquisition, text extraction, normalization, sentence splitting, and sentence-level deduplication.
Creation mode:
Automatic
Creation Tools
http://nlp.ilsp.gr/r...
Resource Creation
Created using ELRC Services
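The creation details above mention normalization followed by sentence-level deduplication. As a rough illustration only, and not the ILSP Focused Crawler's actual implementation, the idea can be sketched in Python as follows. The function names `normalize` and `dedupe_sentences` are hypothetical, and the choice of Unicode NFC plus whitespace collapsing as the normalization step is an assumption made for this sketch.

```python
import unicodedata


def normalize(sentence: str) -> str:
    """Apply Unicode NFC normalization and collapse whitespace runs.

    Assumption: the real pipeline's normalization rules are not documented
    in this record; NFC plus whitespace collapsing is used only as an example.
    """
    text = unicodedata.normalize("NFC", sentence)
    return " ".join(text.split())


def dedupe_sentences(sentences):
    """Keep the first occurrence of each normalized sentence, in order."""
    seen = set()
    unique = []
    for sentence in sentences:
        key = normalize(sentence)
        # Skip empty lines and sentences already seen after normalization.
        if not key or key in seen:
            continue
        seen.add(key)
        unique.append(key)
    return unique


if __name__ == "__main__":
    sample = ["Mirë  se vini.", "Mirë se vini.", "", "Faleminderit."]
    for line in dedupe_sentences(sample):
        print(line)
```

Comparing normalized forms rather than raw strings means that sentences differing only in spacing or in composed versus decomposed characters are treated as duplicates.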
Funding Project
European Language Resource Coordination 3.0
(ELRC3.0 - SMART 2019/1083 LC-01325001)
URL:
http://www.lr-coordi...
Funding Type:
EU Funds
Funder:
European Commission
Funding Country:
European Union (EU)
Metadata
Created:
22/09/2016
Last Updated:
12/04/2017
Metadata Language:
English (en)
Metadata Creator
Maria Giagkou
Institute for Language and Speech Processing / Athena Research Center
ILSP / ATHENA R.C.
Greece (GR)
http://www.ilsp.gr, http://www.athenarc.gr
Kanella Pouli
Greece (GR)
Version
Version:
1.0
Last Updated:
23/02/2021