Dear Researchers,
Since I am new to the field of internet security, I would appreciate your help with the meaning of the following features.
We have DNS domain names such as google.com or youtube.com, and I want to extract different features from them, both lexical and web-scraped.
Lexical Features:
What is the meaning of the following features? Please explain each with an example.
1) Different ratios (number to length, alphabet to length)?
2) Hash?
3) Distance between a number and a letter? (You can find the meaning of these features in the paper Feature Extraction Approach to Unearth Domain Generating Algorithms (DGAs) - Page 401)
4) English domain names, non-English yet pronounceable domain names, uni-grams?
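To make question 1 concrete, this is how I currently read the ratio features. It is only my guess and not taken from the paper: the function name lexical_ratios, the choice to look only at the part before the first dot, and the example domains are all my own assumptions.

```python
def lexical_ratios(domain: str) -> dict:
    """Character ratios over the label left of the first dot (my assumption)."""
    name = domain.split(".")[0]  # e.g. "google" from "google.com"
    length = len(name)
    if length == 0:
        # Avoid dividing by zero for an empty label.
        return {"digit_ratio": 0.0, "letter_ratio": 0.0}
    digits = sum(ch.isdigit() for ch in name)   # count of numeric characters
    letters = sum(ch.isalpha() for ch in name)  # count of alphabetic characters
    return {
        "digit_ratio": digits / length,    # my reading of "number to length"
        "letter_ratio": letters / length,  # my reading of "alphabet to length"
    }


print(lexical_ratios("google.com"))
print(lexical_ratios("abc123.com"))
```

Is this the intended meaning, or are the ratios computed over the full domain name including the TLD?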
Web Scraping:
We extract information about the queried domain name from the web using Python (You can find the meaning of these features in the paper Feature Extraction Approach to Unearth Domain Generating Algorithms (DGAs) - Page 403).
1) Levenshtein distance (seq1, seq2): what is seq2?
2) Typosquat process?
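For reference, this is the standard Levenshtein (edit) distance as I understand it, written as a small dynamic-programming sketch. The function name levenshtein and the example strings are my own. I am guessing that seq1 is the queried domain and seq2 is some name it is compared against, but that second part is exactly what I am unsure about.

```python
def levenshtein(seq1: str, seq2: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn seq1 into seq2."""
    # previous[j] holds the distance between the first i-1 characters
    # of seq1 and the first j characters of seq2.
    previous = list(range(len(seq2) + 1))
    for i, c1 in enumerate(seq1, start=1):
        current = [i]  # distance from seq1[:i] to the empty string
        for j, c2 in enumerate(seq2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute c1 with c2, or match
            ))
        previous = current
    return previous[-1]


# Hypothetical pair: a misspelled name against the name it resembles.
print(levenshtein("gooogle", "google"))
```

If seq2 is meant to be something else, for example a name scraped from the web for the queried domain, please correct me.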
Thanks