It would help to know how many potential explanatory variables you are starting with, but I will assume it is many, given your comments about sparsity and multiple interactions.
If by "separation problem" you mean fitting issues near the boundaries of parameter spaces, e.g. when you have nearly perfect predictions, I know of 3 main approaches for dealing with this in logistic regression: (1) regularize; (2) focus on likelihood-ratio (LR) tests; (3) use exact methods.
(1) You might look into elastic-net regularized Cox proportional hazards (PH) regression, such as the coxnet fit implemented in the R package glmnet. As you probably know, you can trick Cox PH regression, usually used for survival data, into fitting the conditional logistic regression model for matched case-control data (see e.g. the clogit function in the R package survival). In addition to the other benefits of regularization - avoiding over-fitting and improving predictions in particular - you get a solution to separation problems for free.
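To make the trick concrete, here is a minimal sketch in R. The data frame d and its columns (case, set, x1, x2) are hypothetical, and I am going from memory that recent versions of glmnet (4.1 or later, I believe) accept stratified Cox responses via stratifySurv():

```r
library(survival)
library(glmnet)

## d is a hypothetical data frame with columns:
##   case (1 = case, 0 = control), set (matched-set id), x1, x2

## Unpenalized conditional logistic regression via the Cox trick:
fit_clogit <- clogit(case ~ x1 * x2 + strata(set), data = d)

## The elastic-net version: the same model as a stratified Cox fit in glmnet.
x <- model.matrix(~ x1 * x2, data = d)[, -1]            # drop the intercept
y <- stratifySurv(Surv(rep(1, nrow(d)), d$case), d$set) # one "event" per case
cv <- cv.glmnet(x, y, family = "cox", alpha = 0.5)      # alpha mixes L1 and L2
coef(cv, s = "lambda.min")                              # penalized coefficients
```

The Surv(rep(1, n), case) construction puts every member of a matched set in the same risk set, so the Cox partial likelihood reproduces the conditional logistic likelihood.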
I don't, however, know how well it will work with very sparse data. There ARE variations of this approach specifically designed for sparse solutions but that, I believe, is a different issue. Moreover, the interactions in your model present something of an additional problem. Usually you would not want main effects for a given variable to be dropped while retaining interactions involving that variable. To achieve that constraint you can in principle use "grouped regularization". Alas, I am not aware of publicly available software that gives you that option. (If there is some, I would love to hear about it!)
Also, you don't mention the goal of your analysis. For prediction, this approach is good. Doing inference using such approaches is possible but relatively undeveloped; I think Tibshirani and colleagues, among others, might have some recent publications in that area.
(2) IF you have few enough variables that regularization is not crucial, you might, depending on your goals, be able to deal with the separation problem by using likelihood-ratio tests. If I recall correctly, the LR test remains well-behaved under separation while the Wald test does not: under separation the coefficient estimate and its standard error both blow up, so the Wald statistic becomes meaningless (it can even shrink toward zero - the Hauck-Donner effect), whereas the comparison of likelihoods is still sound.
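A tiny simulated illustration of the contrast (all names hypothetical):

```r
set.seed(1)
x <- c(rnorm(20, -2), rnorm(20, 2))
y <- as.numeric(x > 0)            # perfectly separated by x, by construction

fit1 <- glm(y ~ x, family = binomial)   # warns: fitted probabilities 0 or 1
fit0 <- glm(y ~ 1, family = binomial)

## Wald: the estimate and its SE both diverge, so z tells you nothing.
summary(fit1)$coefficients

## LR: comparing the two likelihoods still gives a sensible (here, highly
## significant) answer.
anova(fit0, fit1, test = "LRT")
```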
(3) Finally, you might consider using so-called EXACT conditional logistic regression, which should help with both the sparsity and separation problems. I know this is available in SAS and LogXact, though they might choke on a sample of size 1200 (I'm not sure). In R, you might be able to use the elrm package IF you have matched pairs: with pairs, you can convert the problem into (unconditional) logistic regression, which is what elrm handles; see e.g. Breslow, N. E. (1982), "Covariance Adjustment of Relative-Risk Estimates in Matched Studies," Biometrics, 38, 661–672, and/or - dare I say it - the SAS manual, http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_logistic_sect062.htm .
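For the matched-pairs conversion Breslow describes, the conditional likelihood for 1:1 pairs reduces to an unconditional logistic regression on within-pair covariate differences, with every response set to 1 and no intercept. A sketch with hypothetical data frames and column names:

```r
## cases and controls are hypothetical data frames; row i of each holds the
## case and the control from matched pair i.
z <- as.matrix(cases[, c("x1", "x2")]) - as.matrix(controls[, c("x1", "x2")])
d_pairs <- data.frame(y = 1, z)

## All-ones response, no intercept: this IS the conditional likelihood.
fit <- glm(y ~ x1 + x2 - 1, family = binomial, data = d_pairs)
summary(fit)

## elrm would then be applied to this unconditional formulation (it expects
## collapsed binomial counts; see the package documentation for the details).
```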
Are you talking about a classification problem? In general, an SVM (support vector machine) gives good results, and there are also sparse variants of the SVM that use the L1 norm.
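A hedged sketch of the sparse variant, assuming the LiblineaR package (whose type = 5, if I recall correctly, requests L1-regularized L2-loss support vector classification); the data frame d and its response column y are hypothetical:

```r
library(LiblineaR)

## d is a hypothetical data frame with a 0/1 response y and numeric predictors.
X <- scale(model.matrix(~ . - y, data = d)[, -1])  # standardize, drop intercept
fit <- LiblineaR(data = X, target = d$y, type = 5, cost = 1)
fit$W  # the L1 penalty drives many weights to exactly zero
```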