Hanja
Getting Hanja characters for Korean words
 
src.data_access.DataAccess Class Reference

A class to handle database operations for managing Korean words and their associated Hanja characters.

[Figure: collaboration diagram for src.data_access.DataAccess]

Public Member Functions

 initialize_database (self)
 Initializes the database by creating necessary tables if they don't already exist.
 
 insert_data (self, processed_data)
 Inserts processed data into the 'korean_words' table.
 
 insert_hanja_data (self, hanja_dict)
 Inserts Hanja data into the 'hanja_characters' table.
 
 drop_tables (self)
 Drops the 'hanja_characters' and 'korean_words' tables if they exist.
 
 remove_duplicates (self)
 
 get_hanja_for_word (self, word)
 Retrieves Hanja associated with a given Korean word.
 
 get_hanja_meanings_for_word (self, word, hanja_list, language)
 Retrieves the meaning of every Hanja character associated with a given Korean word.
 
 get_word_by_korean (self, korean_word, language, hanja_characters=None)
 
 get_related_words (self, hanja_character, language)
 Gets all the words that contain the specified Hanja character.
 
 get_autocomplete_words (self, query, limit=10)
 Fetches a list of word suggestions that start with the provided query.
 
 find_word_with_unique_hanja (self)
 Finds a Korean word where at least one Hanja character is unique to that word.
 

Detailed Description

A class to handle database operations for managing Korean words and their associated Hanja characters.

Member Function Documentation

◆ drop_tables()

src.data_access.DataAccess.drop_tables (   self)

Drops the 'hanja_characters' and 'korean_words' tables if they exist.

◆ find_word_with_unique_hanja()

src.data_access.DataAccess.find_word_with_unique_hanja (   self)

Finds a Korean word where at least one Hanja character is unique to that word.

Returns
A list of words with unique Hanja characters.
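
One way to read the condition above is that a Hanja character is "unique" when it occurs in exactly one word. The following is a minimal in-memory sketch under that assumption. The function name `words_with_unique_hanja` and the sample dictionary are hypothetical, and the actual method presumably works against the database rather than a dict.

```python
from collections import Counter


def words_with_unique_hanja(word_to_hanja):
    """Return words that contain at least one Hanja found in no other word.

    `word_to_hanja` maps a Korean word to its Hanja string (hypothetical input).
    """
    counts = Counter()
    for hanja in word_to_hanja.values():
        # Count each character once per word, even if it repeats inside the word.
        counts.update(set(hanja))
    return [
        word
        for word, hanja in word_to_hanja.items()
        if any(counts[ch] == 1 for ch in hanja)
    ]


sample = {"한자": "漢字", "문자": "文字", "한문": "漢文", "학교": "學校"}
result = words_with_unique_hanja(sample)
```

With this sample only 학교 qualifies, because 學 and 校 appear in no other word, while 漢, 字 and 文 are each shared by two words.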

◆ get_autocomplete_words()

src.data_access.DataAccess.get_autocomplete_words (   self,
  query,
  limit = 10 
)

Fetches a list of word suggestions that start with the provided query.

Parameters
query: The prefix to search for.
limit: Maximum number of results to return.
Returns
A list of matching word strings.
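
A prefix search with a result cap can be expressed with `LIKE` and `LIMIT`. The sketch below assumes the data lives in SQLite and is accessed through Python's standard `sqlite3` module, which the source does not state. The helper name `autocomplete` and the one-column demo table are illustrative only.

```python
import sqlite3


def autocomplete(conn, query, limit=10):
    """Return up to `limit` distinct words that start with `query` (hypothetical helper)."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT DISTINCT word FROM korean_words "
        "WHERE word LIKE ? ESCAPE '\\' ORDER BY word LIMIT ?",
        (escaped + "%", limit),
    ).fetchall()
    return [row[0] for row in rows]


# Demo on an in-memory database with a reduced korean_words table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE korean_words (word TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO korean_words (word) VALUES (?)",
    [("한자",), ("한국",), ("한글",), ("학교",)],
)
suggestions = autocomplete(conn, "한", limit=2)
```

Binding the pattern as a parameter keeps the query safe from SQL injection, and the `ESCAPE` clause stops a typed `%` or `_` from acting as a wildcard.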

◆ get_hanja_for_word()

src.data_access.DataAccess.get_hanja_for_word (   self,
  word 
)

Retrieves Hanja associated with a given Korean word.

Parameters
word (str): The Korean word for which to fetch Hanja characters.
Returns
A list of tuples containing Hanja characters. Returns None if no data is found.

◆ get_hanja_meanings_for_word()

src.data_access.DataAccess.get_hanja_meanings_for_word (   self,
  word,
  hanja_list,
  language 
)

Retrieves the meaning of every Hanja character associated with a given Korean word.

Parameters
word (str): The Korean word for which to fetch Hanja meanings.
hanja_list (list): The list of Hanja associated with the initial word.
language (str): The language of the page.
Returns
A list of tuples containing the Hanja character, its Korean pronunciation, and its meaning. Returns None if no data is found.

◆ get_related_words()

src.data_access.DataAccess.get_related_words (   self,
  hanja_character,
  language 
)

Gets all the words that contain the specified Hanja character.

Parameters
hanja_character: The Hanja character to search for.
language: The language for the definition.
Returns
A list of matching entries.
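
A containment lookup of this kind could be sketched as follows, again assuming SQLite through the standard `sqlite3` module. The helper name `related_words`, the language codes `"en"` and `"fr"`, and the shape of the returned tuples are assumptions. Only the table and column names come from the schema documented for `initialize_database()`.

```python
import sqlite3

# Assumed language codes; the source does not list the accepted values.
DEFINITION_COLUMNS = {"en": "englishDefinition", "fr": "frenchDefinition"}


def related_words(conn, hanja_character, language):
    """Return (word, hanja, definition) rows whose hanja contains the character."""
    # Look the column up in a fixed mapping so no user input reaches the SQL text.
    column = DEFINITION_COLUMNS[language]
    return conn.execute(
        f"SELECT word, hanja, {column} FROM korean_words "
        "WHERE instr(hanja, ?) > 0 ORDER BY word",
        (hanja_character,),
    ).fetchall()


# Demo on an in-memory database with a reduced korean_words table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE korean_words (word TEXT NOT NULL, hanja TEXT, "
    "englishDefinition TEXT, frenchDefinition TEXT)"
)
conn.executemany(
    "INSERT INTO korean_words VALUES (?, ?, ?, ?)",
    [
        ("한자", "漢字", "Chinese characters", "caractères chinois"),
        ("문자", "文字", "letter", "lettre"),
        ("한문", "漢文", "classical Chinese", "chinois classique"),
    ],
)
matches = related_words(conn, "字", "en")
```

An unknown language code raises `KeyError` in this sketch. Whether the real method falls back to a default language is not documented.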

◆ get_word_by_korean()

src.data_access.DataAccess.get_word_by_korean (   self,
  korean_word,
  language,
  hanja_characters = None 
)
Fetches a word entry by its Korean text.

Parameters
korean_word: The Korean word to search for.
language: The language for the definition.
Returns
A list of matching entries.

◆ initialize_database()

src.data_access.DataAccess.initialize_database (   self)

Initializes the database by creating necessary tables if they don't already exist.

Database Schema:

korean_words

  • id (INTEGER, PRIMARY KEY): Unique identifier for each word.
  • word (TEXT, NOT NULL): Korean word.
  • hanja (TEXT): Associated Hanja characters.
  • glossary (TEXT): Word's glossary or definition in Korean.
  • englishLemma (TEXT): Lemma/word in English.
  • englishDefinition (TEXT): English definition of the word.
  • frenchLemma (TEXT): Lemma/word in French.
  • frenchDefinition (TEXT): French definition of the word.

hanja_characters

  • id (INTEGER, PRIMARY KEY AUTOINCREMENT): Unique identifier for each Hanja character.
  • character (TEXT, NOT NULL, UNIQUE): The Hanja character.
  • korean (TEXT, NOT NULL): Korean pronunciation of the Hanja.
  • englishDefinition (TEXT): Meaning of the Hanja character in English.
  • frenchDefinition (TEXT): Meaning of the Hanja character in French.
  • pronounciation (TEXT): HTML link to an audio file with the pronunciation.
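
The column types listed above match SQLite's type names. As a minimal sketch, assuming the class uses Python's standard `sqlite3` module (the source does not name the database engine), the two tables could be created like this. The function name `create_schema` and the in-memory connection are illustrative only.

```python
import sqlite3

# Assumption: SQLite DDL. Column names and constraints follow the documented schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS korean_words (
    id INTEGER PRIMARY KEY,
    word TEXT NOT NULL,
    hanja TEXT,
    glossary TEXT,
    englishLemma TEXT,
    englishDefinition TEXT,
    frenchLemma TEXT,
    frenchDefinition TEXT
);
CREATE TABLE IF NOT EXISTS hanja_characters (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    "character" TEXT NOT NULL UNIQUE,
    korean TEXT NOT NULL,
    englishDefinition TEXT,
    frenchDefinition TEXT,
    pronounciation TEXT
);
"""


def create_schema(conn):
    """Create both tables if they do not exist yet (hypothetical helper)."""
    conn.executescript(SCHEMA)
    conn.commit()


conn = sqlite3.connect(":memory:")
create_schema(conn)
create_schema(conn)  # a second call is a no-op thanks to IF NOT EXISTS
```

`IF NOT EXISTS` is what makes initialization safe to repeat, which matches the documented behaviour of creating tables only when they are missing.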

◆ insert_data()

src.data_access.DataAccess.insert_data (   self,
  processed_data 
)

Inserts processed data into the 'korean_words' table.

Parameters
processed_data (list): A list of dictionaries containing word data. Each dictionary should have:
  • id (int): The unique ID for the word (optional if the database assigns it automatically).
  • word (str): The Korean word.
  • hanja (str): Associated Hanja characters.
  • glossary (str): Meaning of the word in Korean.
  • englishLemma (str): Lemma/word in English.
  • englishDefinition (str): English definition of the word.
  • frenchLemma (str): Lemma/word in French.
  • frenchDefinition (str): French definition of the word.
  • pronounciation (str): HTML link to an audio file with the word's pronunciation.

◆ insert_hanja_data()

src.data_access.DataAccess.insert_hanja_data (   self,
  hanja_dict 
)

Inserts Hanja data into the 'hanja_characters' table.

Parameters
hanja_dict (dict): A dictionary mapping each Hanja character to a list of dictionaries holding its Korean reading and definitions. Example:
{
'漢': [{'kor': '한', 'englishDefinition': 'China', 'frenchDefinition': 'Chine'}],
'字': [{'kor': '자', 'englishDefinition': 'Character', 'frenchDefinition': 'Caractère'}]
}
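
A dictionary of this shape can be flattened into rows and written with a single `executemany` call. The sketch below assumes SQLite through the standard `sqlite3` module. The helper name `insert_hanja` and the use of `INSERT OR IGNORE` to skip characters that are already stored are assumptions, since the source does not say how conflicts with the UNIQUE constraint are handled.

```python
import sqlite3


def insert_hanja(conn, hanja_dict):
    """Insert one row per (character, entry) pair (hypothetical helper)."""
    rows = [
        (
            character,
            entry["kor"],
            entry.get("englishDefinition"),
            entry.get("frenchDefinition"),
        )
        for character, entries in hanja_dict.items()
        for entry in entries
    ]
    # Assumption: rows that would violate UNIQUE(character) are silently skipped.
    conn.executemany(
        'INSERT OR IGNORE INTO hanja_characters '
        '("character", korean, englishDefinition, frenchDefinition) '
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


# Demo on an in-memory database with a reduced hanja_characters table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hanja_characters ("
    'id INTEGER PRIMARY KEY AUTOINCREMENT, "character" TEXT NOT NULL UNIQUE, '
    "korean TEXT NOT NULL, englishDefinition TEXT, frenchDefinition TEXT)"
)
hanja_dict = {
    "漢": [{"kor": "한", "englishDefinition": "China", "frenchDefinition": "Chine"}],
    "字": [{"kor": "자", "englishDefinition": "Character", "frenchDefinition": "Caractère"}],
}
insert_hanja(conn, hanja_dict)
```

Because the character column is UNIQUE, only the first entry for a character survives in this sketch when its list holds several readings.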

◆ remove_duplicates()

src.data_access.DataAccess.remove_duplicates (   self)

The documentation for this class was generated from the following file: