A crucial challenge in data science is extracting meaningful information embedded in high-dimensional data. Such information is often represented in a low-dimensional space by a set of features that capture the original data at different levels. Wavelet analysis is a widely used method for decomposing time-series signals into a few levels with fine temporal resolution. However, the wavelets obtained after decomposition are intertwined and can be over-represented across levels within each sample and across different samples within one population. In this work, using simulated spikes, experimental neural spikes and calcium imaging signals, and human electrocorticographic signals, we leveraged conditional mutual information between wavelets for feature selection. The meaningfulness of the selected features was verified by decoding the stimulus or condition from dynamic neural responses. We demonstrated that decoding with only a small set of these features can achieve high decoding accuracy. These results provide a new approach to wavelet analysis for extracting essential features of the dynamics of spatiotemporal neural data, which in turn can support the design of novel machine learning models with representative features.
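To make the role of conditional mutual information concrete, the following is a minimal sketch that assumes the wavelet coefficients have already been discretized into a few bins. It uses a plug-in estimate of I(X; Y | Z) and a generic greedy max-min selection rule; the function names, the binning assumption, and the selection rule are illustrative and are not taken from the work summarized above.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits for equal-length discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        ratio = n_z[zi] * count / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        total += (count / n) * math.log2(ratio)
    return total


def select_features(features, labels, k):
    """Greedy max-min selection (an assumed, generic criterion).

    A candidate is scored by the smallest information it carries about the
    labels once any single already-selected feature is known, so redundant
    copies of a selected feature score zero.
    """
    constant = [0] * len(labels)  # conditioning on a constant gives plain I(X; Y)
    selected = []
    while len(selected) < k:
        candidates = [name for name in features if name not in selected]

        def score(name):
            conditioners = [features[s] for s in selected] or [constant]
            return min(
                conditional_mutual_information(features[name], labels, c)
                for c in conditioners
            )

        selected.append(max(candidates, key=score))
    return selected


# Hypothetical toy data: the label is determined by two independent binary features.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [2 * ai + bi for ai, bi in zip(a, b)]
features = {"a": a, "a_copy": list(a), "b": b}
picked = select_features(features, labels, k=2)
```

In this toy example the duplicated feature adds no information once its twin has been selected, so the second pick falls on the complementary feature rather than on the copy.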
Clinical studies sometimes encounter truncation by death, rendering some outcomes undefined. Statistical analysis based solely on observed survivors may give biased results because the characteristics of survivors differ between treatment groups. By principal stratification, the survivor average causal effect was proposed as a causal estimand defined in always-survivors. However, this estimand is not identifiable when there is unmeasured confounding between the treatment assignment and survival or outcome process. In this paper, we consider the comparison between an aggressive treatment and a conservative treatment with monotonicity on survival. First, we show that the survivor average causal effect on the conservative treatment is identifiable based on a substitutional variable under appropriate assumptions, even when the treatment assignment is not ignorable. Next, we propose an augmented inverse probability weighting (AIPW) type estimator for this estimand with double robustness. Finally, large sample properties of this estimator are established. The proposed method is applied to investigate the effect of allogeneic stem cell transplantation types on leukemia relapse.
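For orientation, the generic AIPW construction can be sketched as follows. This is the textbook estimator of an average treatment effect under ignorable assignment, not the estimator proposed above for the survivor average causal effect with a substitutional variable; the function name and the toy numbers are assumptions made for illustration only.

```python
def aipw_mean_difference(y, t, propensity, mu1, mu0):
    """Textbook AIPW estimate of E[Y(1)] - E[Y(0)].

    y          : observed outcomes
    t          : treatment indicators (1 or 0)
    propensity : fitted P(T = 1 | X) for each unit, strictly between 0 and 1
    mu1, mu0   : fitted outcome regressions E[Y | T = 1, X] and E[Y | T = 0, X]
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, propensity, mu1, mu0):
        total += (
            m1 - m0
            + ti * (yi - m1) / ei              # weighted residual, treated arm
            - (1 - ti) * (yi - m0) / (1 - ei)  # weighted residual, control arm
        )
    return total / len(y)


# Hypothetical toy data.
t = [1, 0, 1, 0]
y = [3.0, 2.0, 5.0, 4.0]

# Case 1: outcome model reproduces the data, propensities are arbitrary.
exact_outcome_model = aipw_mean_difference(
    y, t, [0.2, 0.9, 0.5, 0.3], mu1=[3.0, 4.0, 5.0, 6.0], mu0=[1.0, 2.0, 3.0, 4.0]
)

# Case 2: outcome model is uninformative (all zeros); the estimator reduces to IPW.
ipw_only = aipw_mean_difference(
    y, t, [0.5, 0.5, 0.5, 0.5], mu1=[0.0] * 4, mu0=[0.0] * 4
)
```

The two cases illustrate the mechanics behind double robustness: when the outcome regressions fit exactly, the residual terms vanish and the propensities play no role, and when the outcome regressions carry no information, the estimate is driven entirely by the inverse probability weights.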
Deep learning has been successfully applied to predicting asset prices from financial time series data. Image-based deep learning models excel at extracting spatial information from images, yet their potential in financial applications has not been fully explored. Here we propose a new model, the channel and spatial attention convolutional neural network (CS-ACNN), for price trend prediction that takes arbitrary images constructed from financial time series data as input. The model incorporates attention mechanisms between convolutional layers to focus on the areas of each image that are most relevant for price trends. CS-ACNN outperforms benchmarks on exchange-traded fund (ETF) data in terms of both model classification metrics and investment profitability, achieving out-of-sample Sharpe ratios ranging from 1.57 to 3.03 after accounting for transaction costs. In addition, we confirm that the images constructed with our methodology lead to better performance than models based on traditional time series data. Finally, the model learns visual patterns that are consistent with traditional technical analysis, providing an economic rationale for the learned patterns and allowing investors to interpret the model.
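As a reminder of how a Sharpe ratio net of transaction costs can be computed, here is a minimal sketch. The proportional cost model, the 252-period annualization, and all names and numbers are assumptions for illustration; the abstract does not specify the cost model or the annualization convention actually used.

```python
import math
import statistics


def net_returns(positions, asset_returns, cost_per_unit_turnover):
    """Per-period strategy returns after proportional transaction costs.

    positions[i] is the position held over period i (e.g. 1 = long, 0 = flat);
    a cost is charged on every change in position, starting from a flat book.
    """
    net = []
    previous = 0.0
    for position, asset_return in zip(positions, asset_returns):
        turnover = abs(position - previous)
        net.append(position * asset_return - cost_per_unit_turnover * turnover)
        previous = position
    return net


def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - risk_free_per_period for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


# Hypothetical three-day example: long for two days, then flat.
positions = [1, 1, 0]
asset_returns = [0.01, 0.02, 0.03]
gross = net_returns(positions, asset_returns, cost_per_unit_turnover=0.0)
net = net_returns(positions, asset_returns, cost_per_unit_turnover=0.001)
sharpe_gross = annualized_sharpe(gross)
sharpe_net = annualized_sharpe(net)
```

Charging costs on turnover lowers the mean return on days when the position changes, so the net Sharpe ratio in this example is below the gross one.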
We propose several methods to obtain endogenous and positive ultimate forward rates (UFRs) for risk-free interest rate curves based on the Smith-Wilson method. The Smith-Wilson method, adopted by Solvency II, can both interpolate the market price data and extrapolate to the UFR. However, it requires an exogenously chosen UFR. de Kort and Vellekoop (2016) proposed an optimization problem to obtain an endogenous UFR. In this paper, we prove the existence of an optimal endogenous UFR for their optimization problem. In addition, to ensure the positivity of the optimal UFR, we formulate a new optimization framework with nonnegativity constraints. Furthermore, we propose another optimization framework that generates endogenous and positive UFRs using prior knowledge. The feasibility of both methods is proven under several mild conditions. We use Chinese government bond data to illustrate the capabilities of our methods and to examine the dynamic behaviour of Chinese risk-free interest rate curves.
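The interpolation and extrapolation behaviour of the basic Smith-Wilson method can be illustrated with a short sketch. It fits zero-coupon bond prices with an exogenously fixed, continuously compounded UFR (omega) and convergence speed (alpha), using the standard Wilson kernel; it does not implement the endogenous-UFR optimization frameworks described above, and the maturities, prices, and parameter values are hypothetical.

```python
import math


def wilson(t, u, alpha, omega):
    """Wilson kernel W(t, u) of the Smith-Wilson method."""
    lo, hi = min(t, u), max(t, u)
    return math.exp(-omega * (t + u)) * (
        alpha * lo - math.exp(-alpha * hi) * math.sinh(alpha * lo)
    )


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_smith_wilson(maturities, prices, alpha, omega):
    """Return the discount function P(t) fitted to zero-coupon bond prices."""
    kernel = [[wilson(ui, uj, alpha, omega) for uj in maturities] for ui in maturities]
    rhs = [p - math.exp(-omega * u) for p, u in zip(prices, maturities)]
    zeta = solve(kernel, rhs)

    def discount(t):
        return math.exp(-omega * t) + sum(
            z * wilson(t, u, alpha, omega) for z, u in zip(zeta, maturities)
        )

    return discount


def forward_rate(discount, t, h=1e-4):
    """Finite-difference approximation of the instantaneous forward rate at t."""
    return -(math.log(discount(t + h)) - math.log(discount(t))) / h


# Hypothetical market data: zero-coupon bonds with yields of 2%, 2.5% and 3%.
maturities = [1.0, 5.0, 10.0]
prices = [math.exp(-0.02 * 1.0), math.exp(-0.025 * 5.0), math.exp(-0.03 * 10.0)]
discount = fit_smith_wilson(maturities, prices, alpha=0.1, omega=0.04)
long_forward = forward_rate(discount, 150.0)
```

The fitted curve reproduces the input prices at the quoted maturities, and the forward rate far beyond the last maturity approaches omega, which is why the choice of UFR matters for long-dated extrapolation.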