How to calculate the overall similarity of text stories using cosine similarity in Python?

Fábio Lobato

Dear Tahir,

This is a very tricky question. Your problem looks like one I faced some time ago. The approach I used was to perform an "all-against-all" comparison.

Imagine that we have four stories (a, b, c, d). I made the following comparisons: a-a (equal to 1, for sure!), a-b, a-c, a-d, b-b (1 again), b-c, and so on up to d-d. In the end I obtained a triangular matrix holding all the pairwise comparisons.

As an index for story a I used the median of row 0, for b the median of row 1, and so on. Of course, you can choose whichever statistical measure suits your problem.
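The comparison described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code from the original project: the stories are turned into simple word-count vectors, cosine similarity is used as the pairwise measure (as in the question title), the triangular matrix is mirrored into a full symmetric one so that every row holds all comparisons for its story, and the self-comparison is left out of the median by default. The function names `cosine_similarity`, `similarity_matrix`, and `story_indices`, the `include_self` flag, and the sample stories are all hypothetical.

```python
import math
from collections import Counter
from statistics import median


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty story is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(stories):
    """All-against-all comparison: a-a, a-b, ..., d-d."""
    # Assumption: plain lowercase whitespace tokenisation, no TF-IDF weighting.
    vectors = [Counter(story.lower().split()) for story in stories]
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # compute the upper triangle only
            sim = cosine_similarity(vectors[i], vectors[j])
            matrix[i][j] = sim
            matrix[j][i] = sim  # mirror it so each row is complete
    return matrix


def story_indices(matrix, include_self=False):
    """One index per story: the median of its row of similarities.

    Needs at least two stories when include_self is False.
    """
    indices = []
    for i, row in enumerate(matrix):
        values = [v for j, v in enumerate(row) if include_self or j != i]
        indices.append(median(values))
    return indices


# Hypothetical sample data, for illustration only.
stories = [
    "the king lost his crown in the forest",
    "the queen found a crown in the forest",
    "a dragon burned the village at night",
    "the village rebuilt after the dragon left",
]
matrix = similarity_matrix(stories)
indices = story_indices(matrix)
for name, value in zip("abcd", indices):
    print(f"story {name}: {value:.3f}")
```

Replacing `median` with `statistics.mean` gives the mean-based variant; setting `include_self=True` keeps the diagonal value of 1 in each row, which is closer to the literal description above but pulls every index upward.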

I'm very interested in your results. Please, add me and let me know if this approach was suitable for you.

Regards,

Fábio

Tahir Abbas

Dear Fabio,

Thanks a lot!

I wonder why you did not use cosine similarity; it does exactly the same thing. However, I am interested in the second part of your approach: the median.

I used the mean. What was your reason for using the median? Could you please explain?

Have you published a paper based on those results? If so, I would appreciate it if you could send me the reference as well.

Thanks for your time.

Hi Tahir,

I used the median because I had little data and expected to see outliers. If you are sure you have no outliers, use the mean.
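A small sketch of why this matters, using made-up similarity values for one story (the numbers are hypothetical and only chosen to show the effect): with few values, a single unusually high or low similarity shifts the mean noticeably, while the median barely moves.

```python
from statistics import mean, median

# Hypothetical similarities of one story against four others;
# the last value plays the role of an outlier (e.g. a near-duplicate story).
row = [0.42, 0.45, 0.40, 0.95]

row_mean = mean(row)      # pulled upward by the outlier
row_median = median(row)  # stays close to the typical values

print(f"mean:   {row_mean:.3f}")
print(f"median: {row_median:.3f}")
```

Here the median stays between 0.42 and 0.45, whereas the mean ends up above every value except the outlier itself.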

I didn't publish these results because the work was developed for a company, and the confidentiality agreement does not allow any publication for now.

Thanks for your reply!