38 Questions 16 Answers 0 Followers
Questions asked by Tong Guo
LLM with >= 6B parameters vs BERT-Large/BERT-Base
26 July 2023 9,339 1 View
If a paper is innovative and its ideas are clearly expressed, but it is written poorly, what is the likelihood of it being accepted?
26 July 2023 5,004 4 View
BERT is described in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". RoBERTa is described in the paper "RoBERTa: A Robustly Optimized BERT...
20 January 2021 830 2 View
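A minimal sketch for comparing the two models side by side, assuming the HuggingFace transformers library and the public bert-base-uncased / roberta-base checkpoints (neither is mentioned in the question itself):

```python
# Compare BERT and RoBERTa encodings of the same sentence.
# Assumes: pip install torch transformers
from transformers import AutoModel, AutoTokenizer

for name in ["bert-base-uncased", "roberta-base"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    inputs = tokenizer("BERT and RoBERTa share the same architecture.",
                       return_tensors="pt")
    outputs = model(**inputs)
    # Both produce a [batch, seq_len, hidden] tensor; the differences lie
    # in pretraining (dynamic masking, no NSP, more data for RoBERTa).
    print(name, outputs.last_hidden_state.shape)
```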
If I do not pretrain a text-generation model like BART, how can I improve the results of a plain transformer such as the one in tensor2tensor? What are some improvement ideas for the transformer on text-generation tasks?
19 August 2020 7,365 3 View
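One standard non-pretraining improvement is label smoothing on the decoder's cross-entropy loss (tensor2tensor enables it by default). A minimal PyTorch sketch; the vocabulary size, smoothing value, and padding index are illustrative assumptions:

```python
import torch
import torch.nn as nn

vocab_size = 32000  # illustrative value
# label_smoothing spreads a little probability mass over non-target tokens,
# which usually helps from-scratch transformer generation/NMT.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=0)

logits = torch.randn(8, 20, vocab_size)          # [batch, seq_len, vocab]
targets = torch.randint(1, vocab_size, (8, 20))  # gold token ids
loss = criterion(logits.view(-1, vocab_size), targets.view(-1))
```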
Named entity recognition (NER) is the task of assigning tags to the tokens of an input text sequence. BERT-CRF is a good NER model. I want to find a better NER model, or to improve the BERT-CRF model. What...
19 August 2020 7,225 2 View
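For reference, a minimal BERT-CRF tagger sketch, assuming the HuggingFace transformers library and the third-party pytorch-crf package (torchcrf); the checkpoint name and tag set size are illustrative:

```python
# Assumes: pip install torch transformers pytorch-crf
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF

class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, model_name="bert-base-cased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.emit = nn.Linear(self.bert.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        hidden = self.bert(input_ids,
                           attention_mask=attention_mask).last_hidden_state
        emissions = self.emit(hidden)  # per-token tag scores
        mask = attention_mask.bool()
        if tags is not None:
            # CRF returns the log-likelihood; negate it to get a loss.
            return -self.crf(emissions, tags, mask=mask, reduction="mean")
        return self.crf.decode(emissions, mask=mask)  # best tag sequences
```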
My task is to generate keywords from sentences. I pretrain a text-generation model: I mask the sentences' tokens and predict the whole sentences' tokens. Pretraining batch_size = 8 and step =...
29 July 2020 5,776 1 View
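A sketch of the masking step described above: randomly mask a fraction of the input tokens and use the full original sentence as the reconstruction target. The mask rate and special token ids are illustrative assumptions:

```python
import torch

def mask_tokens(token_ids, mask_id, mask_prob=0.15, pad_id=0):
    """Randomly replace tokens with [MASK]; the target is the full sentence."""
    inputs = token_ids.clone()
    maskable = token_ids != pad_id
    chosen = (torch.rand_like(token_ids, dtype=torch.float) < mask_prob) & maskable
    inputs[chosen] = mask_id
    # Train the generator to reconstruct *all* tokens of the original
    # sentence from the corrupted input, as described in the question.
    return inputs, token_ids
```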
UDA (https://github.com/google-research/uda) can achieve good accuracy with only 20 training examples on text classification. But I find it hard to reproduce the result on my own dataset. So I...
06 June 2020 1,570 2 View
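The core of UDA is a consistency loss on unlabeled data: the KL divergence between the model's predictions on an example and on its augmented version (the total loss adds this, weighted, to the supervised cross-entropy on the 20 labeled examples). A minimal sketch under that description; the model is assumed to return logits, and the augmenter is left abstract:

```python
import torch
import torch.nn.functional as F

def uda_consistency_loss(model, x_unlabeled, x_augmented, temperature=0.4):
    """KL(pred(x) || pred(aug(x))) on unlabeled text, as in UDA."""
    with torch.no_grad():
        # Sharpened "teacher" distribution on the original example.
        p = F.softmax(model(x_unlabeled) / temperature, dim=-1)
    log_q = F.log_softmax(model(x_augmented), dim=-1)
    # kl_div expects log-probs as input and probs as target.
    return F.kl_div(log_q, p, reduction="batchmean")
```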
Suppose I have plenty of low-quality data from unsupervised or rule-based methods. Do you think removing the wrong data, as predicted by a trained model, is a simple but effective method?
03 June 2020 4,166 3 View
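A sketch of the filtering idea in the question: train a model on the noisy set, then drop examples where the model confidently disagrees with the noisy label. The confidence threshold is an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def filter_noisy_data(model, dataset, threshold=0.9):
    """Keep an example only if the trained model does not confidently
    contradict its (possibly wrong) rule-based label."""
    kept = []
    model.eval()
    with torch.no_grad():
        for x, noisy_label in dataset:
            probs = F.softmax(model(x.unsqueeze(0)), dim=-1).squeeze(0)
            pred = probs.argmax().item()
            # Confident disagreement -> likely a wrong label -> drop it.
            if pred != noisy_label and probs[pred] > threshold:
                continue
            kept.append((x, noisy_label))
    return kept
```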
In a text classification task, if data quantity is low but data quality is not low, we can use data augmentation methods for improvement. But my situation is that data quantity is not low and data...
02 June 2020 3,105 15 View
What is the difference in model design? It seems one difference is that GraphSAGE samples the data. But what is the difference in model architecture?
06 May 2020 7,837 3 View
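A minimal sketch of the architectural contrast: a GCN layer applies one shared weight to a full normalized-adjacency average over all neighbors, while a GraphSAGE mean-aggregator layer concatenates a node's own state with the mean of a sampled, fixed-size neighbor set. Shapes and the sampling step are simplified assumptions:

```python
import torch
import torch.nn as nn

class SageMeanLayer(nn.Module):
    """GraphSAGE mean aggregator: separate treatment of self vs. neighbors;
    the neighbors come from a *sampled* fixed-size set (inductive)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(2 * in_dim, out_dim)

    def forward(self, h_self, h_sampled_neighbors):
        # h_self: [n, in_dim]; h_sampled_neighbors: [n, k, in_dim]
        h_agg = h_sampled_neighbors.mean(dim=1)
        return torch.relu(self.linear(torch.cat([h_self, h_agg], dim=-1)))

class GcnLayer(nn.Module):
    """GCN layer: one shared weight, full normalized adjacency (transductive)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj_norm):
        # adj_norm: [n, n] normalized adjacency (with self-loops).
        return torch.relu(self.linear(adj_norm @ h))
```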
It seems that in GNNs (graph neural networks), in the transductive setting, we input the whole graph, mask the labels of the validation data, and predict those labels. But it seems that in the inductive...
06 May 2020 4,615 3 View
There is a similar task named text classification. But I want to find a kind of model whose input is a keyword set, where the keyword set does not come from a sentence. For example: input...
26 December 2019 4,482 3 View
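Since a keyword set is unordered, a permutation-invariant model fits: embed each keyword, pool with a mean over the set, then classify (a DeepSets-style sketch; all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class KeywordSetClassifier(nn.Module):
    """Order-invariant classifier for a set of keyword ids."""
    def __init__(self, vocab_size, embed_dim=128, num_classes=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, num_classes),
        )

    def forward(self, keyword_ids):
        # keyword_ids: [batch, set_size], 0 = padding
        emb = self.embed(keyword_ids)                    # [b, k, d]
        mask = (keyword_ids != 0).unsqueeze(-1).float()
        # Mean over the real keywords only; order does not matter.
        pooled = (emb * mask).sum(1) / mask.sum(1).clamp(min=1)
        return self.mlp(pooled)
```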
I understand this is a broad question, but there may be some suggestions, and I can try methods that I do not know yet. I think the model is already perfect on the training data, but the test accuracy is...
12 October 2019 5,224 5 View
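Standard remedies when a model is perfect on train but weak on test are stronger regularization and early stopping. A minimal sketch; the architecture, dropout rate, and weight-decay value are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Dropout and weight decay: two standard ways to fight overfitting.
model = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),
    nn.Dropout(p=0.5),  # randomly zero activations during training
    nn.Linear(256, 10),
)
# weight_decay adds L2 regularization to every parameter update.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)

def should_stop(val_losses, patience=3):
    """Early stopping: stop once the best validation loss is `patience`
    epochs old. `val_losses` holds one value per completed epoch."""
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience
```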
For transformer-based neural machine translation (NMT), taking English-Chinese as an example, we feed the English to the encoder and let the decoder input (Chinese) attend to the encoder output, then the final output...
16 September 2019 7,390 1 View
Attention is the mechanism described in the paper "Attention Is All You Need". "Attend" is an operation in TensorFlow or PyTorch.
12 September 2019 6,978 4 View
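For reference, scaled dot-product attention from "Attention Is All You Need" as a minimal PyTorch function; the weighted sum at the end is the step usually called "attending":

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, as in "Attention Is All You Need"."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # [.., q_len, k_len]
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v  # each query "attends" to a mix of the values
```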
TREC is https://microsoft.github.io/TREC-2019-Deep-Learning/. I am new to text retrieval and still cannot understand why the two similar tasks are set up. Thank you very much.
06 August 2019 1,182 3 View
Based on my understanding, both the document-ranking task and the text-similarity task take sentence pairs as model input. We use a different loss for each of them to get better results. Thank you very much.
05 August 2019 3,616 4 View
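A sketch of the usual loss difference, reflecting common practice rather than anything stated in the question: pointwise cross-entropy for similarity classification vs. a pairwise margin loss for ranking relevant documents above irrelevant ones:

```python
import torch
import torch.nn as nn

# Pointwise, for text similarity: score each pair, binary cross-entropy.
bce = nn.BCEWithLogitsLoss()
pair_logits = torch.randn(16)                 # model score per (sent_a, sent_b)
labels = torch.randint(0, 2, (16,)).float()   # 1 = similar, 0 = not
similarity_loss = bce(pair_logits, labels)

# Pairwise, for document ranking: the relevant doc should outscore the other.
margin = nn.MarginRankingLoss(margin=1.0)
pos_scores = torch.randn(16)                  # score(query, relevant_doc)
neg_scores = torch.randn(16)                  # score(query, irrelevant_doc)
ranking_loss = margin(pos_scores, neg_scores, torch.ones(16))
```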
Natural language inference (NLI) is the task of predicting labels (entailment, contradiction, or neutral) for sentence pairs. People have invented a lot of deep models to solve this problem. But I...
05 August 2019 4,712 3 View
I know question-question matching is a text-similarity problem. What about question-answer matching or question-document matching? They are used in information retrieval. Question-question matching is indeed text...
03 August 2019 8,782 3 View
First, I'm not sure whether the model contains the encoder during training. EOS means end-of-sentence. The encoder and decoder are parts of the transformer network. Without the encoder, training...
23 March 2019 9,287 2 View
Language modeling (LM) is the task of predicting the next word. Does the deep model need the encoder? From the PTB code in tensor2tensor, I find the deep model does not contain the encoder. Or both...
22 March 2019 9,640 2 View
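A language model indeed needs no encoder: a decoder-only stack with a causal self-attention mask suffices. A minimal sketch built from PyTorch's transformer layers; all hyperparameters are illustrative assumptions:

```python
import torch
import torch.nn as nn

class DecoderOnlyLM(nn.Module):
    """Next-word prediction with self-attention only: no encoder,
    no cross-attention, just a causal (look-behind) mask."""
    def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids):
        seq_len = token_ids.size(1)
        # Causal mask: position i may attend only to positions <= i.
        causal = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        ).to(token_ids.device)
        h = self.blocks(self.embed(token_ids), mask=causal)
        return self.lm_head(h)  # next-token logits at each position
```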
I'm new to LeakGAN, SeqGAN, and TextGAN. I know a GAN's goal is to generate text such that the discriminator cannot distinguish real text from generated text. LM (language modeling) is the task of predicting the next word...
11 March 2019 4,548 5 View
The inference speed of Transformer-XL is faster than the transformer's. Why? If state reuse is the reason, is the comparison two segments of length 32 with state reuse vs. one segment of length 64 without state reuse?
25 February 2019 9,221 3 View
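A sketch of why state reuse speeds up Transformer-XL inference: each new segment attends to cached, detached hidden states from the previous segment instead of recomputing them. The layer interface below is a hypothetical assumption for illustration, not the actual Transformer-XL code:

```python
import torch

@torch.no_grad()
def transformer_xl_step(layers, segment_emb, memories):
    """One segment of Transformer-XL-style inference with state reuse.
    `layers` is a list of attention blocks taking (x, memory); this
    interface is hypothetical, for illustration only."""
    new_memories = []
    h = segment_emb  # [batch, seg_len, d_model]
    for layer, mem in zip(layers, memories):
        # Keys/values span [cached memory ; current segment], so the
        # old positions are *reused*, not recomputed -- that is the
        # saving vs. re-running a full 64-token window at every step.
        context = torch.cat([mem, h], dim=1) if mem is not None else h
        new_memories.append(h.detach())  # cache current states for next segment
        h = layer(h, memory=context)     # attend over the extended context
    return h, new_memories
```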
RLHF vs. TrainingData-Label-Again-based-on-Reward, where the reward comes from human labeling.
01 January 1970 6,814 3 View
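A sketch of the relabel-and-retrain alternative proposed here: use the human reward only to filter the data, then do ordinary supervised fine-tuning. The dataset fields and helper names are illustrative assumptions:

```python
# Reward-as-filter: keep only generations that humans rated positively,
# then retrain the policy model with plain supervised learning.
# (Sketch of the question's proposal; all names are illustrative.)

def relabel_with_reward(samples):
    """samples: list of dicts {"prompt", "generation", "human_reward"}."""
    good = [s for s in samples if s["human_reward"] > 0]  # drop bad feedback
    # Supervised pairs: prompt -> human-approved generation.
    return [(s["prompt"], s["generation"]) for s in good]

def retrain(policy_model, samples, supervised_finetune):
    train_pairs = relabel_with_reward(samples)
    # No policy gradients involved: the reward only edited the dataset.
    return supervised_finetune(policy_model, train_pairs)
```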
How can a low-level employee working in an IT company, without supervising students like a university professor, become an IEEE fellow?
01 January 1970 7,823 1 View
How to become an IEEE fellow while working in a company without being a university professor?
01 January 1970 9,710 0 View
We collected [good]/[bad] feedback from the web page. Then we removed the [bad]-feedback data and used only the [good]-feedback data to train the text-generation policy model. The [good]...
01 January 1970 4,978 3 View
LLM = large language model
01 January 1970 312 2 View
Deep learning aims for generalization ability. And now deep learning is solving the problem of AI-agent memory. Is that right?
01 January 1970 421 5 View
How can an independent researcher become an IEEE fellow without supervising students?
01 January 1970 8,060 1 View
Reinforcement-Learning-On-NLP means using the reward to update the model. Re-Label-That-Data means using the reward to re-label the related data and then re-train.
01 January 1970 2,819 3 View
Do you agree?
01 January 1970 8,938 2 View
Is it right?
01 January 1970 8,033 4 View
For ChatGPT, if you can collect all the possible pre-training data, then you can just remove the bad-feedback data from the predictions for the reward model. If you cannot collect all the possible pre-train...
01 January 1970 3,788 2 View
Is there a way to become an IEEE fellow without becoming a doctoral supervisor at a university?
01 January 1970 5,020 1 View
GPT stores the candidate to-label data in the big model, in order to reduce the labeling difficulty. Originally, the labeler needed to write the whole answer by themselves.
01 January 1970 7,265 0 View
For ChatGPT, the goal of human feedback is to fix the wrong data in the policy model's dataset. There is no essential difference between reinforcement learning and supervised learning here. Is that right?
01 January 1970 7,643 0 View