Topic modelling can be based on whatever unit of text is relevant for you. Take LDA (https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation) for instance: instead of modelling the distribution of words for a topic, one can model the distribution of n-grams for a topic. The same is true for LSA/LSI (https://en.wikipedia.org/wiki/Latent_semantic_analysis) and NMF (https://en.wikipedia.org/wiki/Non-negative_matrix_factorization#Text_mining): instead of a term-document matrix, you can build an "n-gram-document" matrix. The computations then remain the same.
In practice, scikit-learn's vectorizers (http://scikit-learn.org/stable/modules/classes.html#module-sklearn.feature_extraction.text) can work at the n-gram level through the ngram_range parameter. The resulting matrix can be used as input to any topic modelling procedure (see http://scikit-learn.org/stable/auto_examples/applications/plot_topics_extraction_with_nmf_lda.html#sphx-glr-auto-examples-applications-plot-topics-extraction-with-nmf-lda-py), as in the sketch below.
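Here is a minimal sketch of that pipeline: an n-gram/document count matrix built with CountVectorizer, then factorized with NMF. The corpus, number of topics, and parameter values are illustrative placeholders, and get_feature_names_out assumes a recent scikit-learn version.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import NMF

# Toy corpus, purely for illustration.
docs = [
    "machine learning models learn from data",
    "topic models discover latent structure in text",
    "neural networks are machine learning models",
]

# ngram_range=(1, 2) makes unigrams AND bigrams the vocabulary units,
# so the resulting matrix is an "n-gram - document" matrix.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english")
X = vectorizer.fit_transform(docs)  # shape: (n_documents, n_ngrams)

# Factorize into 2 topics; each topic is a distribution over n-grams.
nmf = NMF(n_components=2, random_state=0)
doc_topic = nmf.fit_transform(X)   # document-topic weights
topic_ngram = nmf.components_      # topic-n-gram weights

# Print the top n-grams per topic.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(topic_ngram):
    top = weights.argsort()[::-1][:5]
    print(f"Topic {k}:", ", ".join(terms[i] for i in top))
```

Swapping NMF for sklearn.decomposition.LatentDirichletAllocation works the same way, since it also consumes the count matrix X unchanged.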
Yes, of course. I have been using n-grams for text analysis with LingPipe, which worked really well for me. You might consider tokenization as a pre-processing step.
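LingPipe is a Java library, but the same idea translates to the scikit-learn setup above: you can plug a custom tokenizer into the vectorizer as the pre-processing step. This is a hedged sketch assuming NLTK is installed and its punkt tokenizer data has been downloaded (nltk.download('punkt')).

```python
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import CountVectorizer

# Supplying a tokenizer makes the vectorizer build its n-grams from the
# tokens this callable returns, instead of its default token pattern.
vectorizer = CountVectorizer(
    tokenizer=word_tokenize,  # pre-processing: NLTK tokenization
    ngram_range=(1, 2),
    lowercase=True,
)
X = vectorizer.fit_transform(["Tokenization happens before n-grams are counted."])
print(vectorizer.get_feature_names_out())
```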