How to calculate the overall similarity of text stories using cosine similarity in Python?

Fábio Lobato

Dear Tahir,

This is a very tricky question. Your problem looks like one I faced some time ago. The approach I used was to perform an "all-against-all" comparison.

Imagine that we have four stories (a, b, c, d). I made the following comparisons: a-a (equal to 1, for sure!), a-b, a-c, a-d, b-b (1 again), b-c, and so on up to d-d. In the end, I obtained a triangular matrix with all the pairwise comparisons.

As the index for story a, I used the median of row 0; for b, the median of row 1; and so on. Of course, you can pick whichever statistical measure suits your problem.
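The comparison described above could be sketched roughly as follows. This is only an illustration, not the code used in the original project: it assumes plain word-count vectors, mirrors the triangular matrix into a full symmetric one so that every row has one entry per story, and keeps the self-comparison (always 1) in each row. The function names and the example stories are made up.

```python
import math
from collections import Counter
from statistics import median


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts, using raw word counts as vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Only words present in both texts contribute to the dot product.
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def similarity_matrix(stories):
    """All-against-all comparison; the upper triangle is mirrored below."""
    n = len(stories)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = cosine_similarity(stories[i], stories[j])
            matrix[i][j] = score
            matrix[j][i] = score
    return matrix


def story_indices(matrix):
    """One overall score per story: the median of its row (diagonal included)."""
    return [median(row) for row in matrix]


# Hypothetical example stories, for illustration only.
stories = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "a dog ran in the park",
    "the dog sat on the mat",
]
matrix = similarity_matrix(stories)
for name, index in zip("abcd", story_indices(matrix)):
    print(name, round(index, 3))
```

Swapping `median` for `statistics.mean` gives the mean-based variant. Whether to keep the diagonal in each row is a design choice; dropping it avoids the constant 1 inflating every story's score.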

I'm very interested in your results. Please add me and let me know whether this approach works for you.

Regards,

Fábio

Tahir Abbas

Dear Fabio,

Thanks a lot!

I am wondering why you did not use cosine similarity! It does exactly the same thing. However, I am interested in the second part of your approach: the median.

I used the mean. What was your reason for using the median? Could you please explain?

Have you published a paper based on those results? If so, I would appreciate it if you could send me the reference as well.

Thanks for your time.

Hi Tahir,

I used the median because I had little data and was expecting to see outliers. If you are sure you have no outliers, use the mean.
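A small sketch of that point, using made-up similarity scores for a single story (the values are hypothetical, not from any real data set): one low outlier pulls the mean down to roughly 0.5, while the median stays at 0.6.

```python
from statistics import mean, median

# Hypothetical similarity scores for one story; the last value is an outlier.
scores = [0.62, 0.58, 0.65, 0.60, 0.05]

print("mean:  ", round(mean(scores), 3))
print("median:", round(median(scores), 3))
```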

I didn't publish these results because the work was done for a company, and the confidentiality agreement does not allow any publication for now.

Thanks for your reply!