How do I parse an HTML page to extract the required information?

Zhao Guyu

If you can use Java, you can try jsoup.jar! Maybe you also need httpclient.jar.

Emin Ogur

You can use Node-RED, which does not require much coding, if you are not an experienced coder or want to get it done more quickly. Of course, it will take some time to get familiar with it.

Ayoub Benayache

You can start from the meta tags: they contain some information about the page or the site and define keywords that describe it. Crawling is another method to extract the data of interest from the target site.
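
As a rough illustration of the meta-tag idea above, here is a minimal sketch using only Python's standard-library `html.parser`. The class name `MetaTagParser`, the helper `extract_meta`, and the sample page are made up for this example; real pages may also use `property=` or `http-equiv=` attributes, which this sketch ignores.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html_text):
    """Return a dict mapping meta names to their content strings."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.meta


# Made-up sample page, used only for illustration.
SAMPLE_HTML = """<html><head>
<title>Example</title>
<meta name="keywords" content="html, parsing, crawler">
<meta name="description" content="A hypothetical sample page.">
</head><body><p>Hello</p></body></html>"""

if __name__ == "__main__":
    for name, content in extract_meta(SAMPLE_HTML).items():
        print(name, "=", content)
```

Many sites leave the keywords tag empty or omit it, so this is probably best treated as a first look at a page rather than a complete extraction method.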

Dashamir Hoxha

You can use wget and other Linux tools. Here is an example: https://unix.stackexchange.com/questions/181254/how-to-use-grep-and-cut-in-script-to-obtain-website-urls-from-an-html-file
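
For comparison, the same kind of URL extraction can be sketched in Python instead of grep and cut. This is not the script from the linked answer; it assumes the HTML has already been downloaded (for instance with wget) and is available as a string. `LinkParser`, `extract_links`, and the example URLs are hypothetical names chosen for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return all link targets found in html_text, in document order."""
    parser = LinkParser(base_url)
    parser.feed(html_text)
    return parser.links


if __name__ == "__main__":
    page = '<a href="/a">A</a> <a href="https://example.org/b">B</a>'
    for link in extract_links(page, "https://example.com/"):
        print(link)
```

Unlike a grep pattern, the parser approach does not depend on how the attributes are quoted or ordered in the markup.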

Daniel Knüttel

I would recommend python3 with a number of libraries.

To fetch the data, use urllib: https://docs.python.org/3.5/library/urllib.request.html#module-urllib.request
Then either use simple built-in features to remove all tags (if you want just the text),
or use BeautifulSoup: https://pypi.python.org/pypi/BeautifulSoup
or use an XML/HTML library: https://stackoverflow.com/questions/2505041/best-library-to-parse-html-with-python-3-and-example#2505127

The data can be accessed in a structured manner through these libraries and you will be able to extract all the data you might need.
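
A minimal sketch of the built-in route described above: fetch a page with `urllib.request` and strip the tags with `html.parser`, keeping only the visible text. The names `TextExtractor`, `html_to_text`, and `fetch_html` are invented for this example, and the fallback to UTF-8 when the server reports no charset is an assumption. BeautifulSoup or another HTML library would likely be more convenient for structured extraction.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Keep text nodes and drop tags, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore whitespace-only nodes and anything inside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html_text):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html_text)
    parser.close()
    return " ".join(parser.chunks)


def fetch_html(url, timeout=10):
    """Download a page and decode it (assumes UTF-8 if no charset is sent)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    print(html_to_text(sample))
```

To run it against a live page, pass the result of `fetch_html(url)` to `html_to_text`; the URL is whatever site you are targeting.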