A way to fight your traffic tickets. The paper was awarded a special prize of
$400 that the author did not have to pay to the state of California.
In view of enormous, extremely surprising and completely unexpected public
interest in this work, we have added an appendix answering the two most common
questions.
This Chapter is written for the Festschrift celebrating the 70th birthday of
the distinguished economist Duncan Foley from the New School for Social
Research in New York. This Chapter reviews applications of statistical physics
methods, such as the principle of entropy maximization, to the probability
distributions of money, income, and global energy consumption per capita. The
exponential probability distribution of wages, predicted by the statistical
equilibrium theory of a labor market developed by Foley in 1996, is supported
by empirical data on income distribution in the USA for the majority (about
97%) of the population. In addition, the upper tail of the income distribution (about
3% of the population) follows a power law and expands dramatically during financial
bubbles, which results in a significant increase in overall income
inequality. A mathematical analysis of the empirical data clearly demonstrates
the two-class structure of a society, as pointed out by Karl Marx and recently
highlighted by the Occupy Movement. Empirical data for the energy consumption
per capita around the world are close to an exponential distribution, which can
also be explained by the entropy maximization principle.
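As a brief worked illustration of how entropy maximization leads to an exponential distribution, the standard textbook argument can be sketched as follows. The symbols $P_k$, $\varepsilon_k$, $\alpha$, $\beta$ and $T$ are notation assumed here for illustration and are not taken from the chapter itself; $\varepsilon$ stands for the conserved quantity per agent (money, income, or energy consumption).

```latex
\begin{aligned}
&\text{maximize } S = -\sum_k P_k \ln P_k
  \quad \text{subject to} \quad
  \sum_k P_k = 1, \qquad \sum_k \varepsilon_k P_k = \langle \varepsilon \rangle, \\
&\frac{\partial}{\partial P_k}
  \Big[ S - \alpha \sum_j P_j - \beta \sum_j \varepsilon_j P_j \Big]
  = -\ln P_k - 1 - \alpha - \beta \varepsilon_k = 0, \\
&P_k = \frac{e^{-\beta \varepsilon_k}}{Z},
  \qquad Z = \sum_j e^{-\beta \varepsilon_j}, \\
&\text{continuum limit, } \varepsilon \ge 0: \quad
  P(\varepsilon) = \frac{1}{T}\, e^{-\varepsilon / T},
  \qquad T = \frac{1}{\beta} = \langle \varepsilon \rangle .
\end{aligned}
```

The continuum form in the last line assumes a uniform density of states over $\varepsilon \ge 0$; under that assumption the single parameter $T$ equals the mean of the distribution.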
FuturICT foundations are social science, complex systems science, and ICT.
The main concerns and challenges in the science of complex systems in the
context of FuturICT are laid out in this paper with special emphasis on the
Complex Systems route to Social Sciences. These include complex systems having:
many heterogeneous interacting parts; multiple scales; complicated transition
laws; unexpected or unpredicted emergence; sensitive dependence on initial
conditions; path-dependent dynamics; networked hierarchical connectivities;
interaction of autonomous agents; self-organisation; non-equilibrium dynamics;
combinatorial explosion; adaptivity to changing environments; co-evolving
subsystems; ill-defined boundaries; and multilevel dynamics. In this context,
science is seen as the process of abstracting the dynamics of systems from
data. This presents many challenges including: data gathering by large-scale
experiment, participatory sensing and social computation, managing huge
distributed dynamic and heterogeneous databases; moving from data to dynamical
models, going beyond correlations to cause-effect relationships, understanding
the relationship between simple and comprehensive models with appropriate
choices of variables, ensemble modeling and data assimilation, modeling systems
of systems of systems with many levels between micro and macro; and formulating
new approaches to prediction, forecasting, and risk, especially in systems that
can reflect on and change their behaviour in response to predictions, and
systems whose apparently predictable behaviour is disrupted by apparently
unpredictable rare or extreme events. These challenges are part of the FuturICT
agenda.
We investigate the structure of the profit landscape obtained from the most
basic, fluctuation-based trading strategy applied to daily stock price
data. The strategy is parameterized
by only two variables, p and q. Stocks are
sold and bought if the log return is bigger than p and less than -q,
respectively. Repetition of this simple strategy for a long time gives the
profit defined in the underlying two-dimensional parameter space of p and q. It
is revealed that the local maxima in the profit landscape are spread in the
form of a fractal structure. The fractal structure implies that successful
strategies are neither localized to any region of the profit landscape nor
spaced evenly throughout it, which makes the
optimization notoriously hard and hypersensitive to partial or limited
information. The concrete implication of this property is demonstrated by
showing that optimizing the strategy on one stock and applying it to that
stock's future values or to other stocks yields a worse profit than a strategy
that ignores fluctuations, i.e., a long-term buy-and-hold strategy.
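The two-parameter strategy described above can be sketched in a few lines of Python. This is only a minimal illustration: the abstract does not specify position sizing, execution price, or costs, so the all-in/all-out positions, execution at the same day's close, zero transaction costs, and starting in cash are all assumptions, and the function names are hypothetical.

```python
import math


def threshold_strategy_profit(prices, p, q):
    """Final wealth (starting from 1.0) of the (p, q) threshold strategy.

    Assumptions not taken from the source: all-in/all-out positions,
    trades executed at the same day's close, no transaction costs,
    and the trader starts fully in cash.
    """
    cash, shares = 1.0, 0.0
    for prev, cur in zip(prices, prices[1:]):
        log_return = math.log(cur / prev)
        if log_return > p and shares > 0.0:
            # Log return above p: sell the whole position.
            cash, shares = shares * cur, 0.0
        elif log_return < -q and cash > 0.0:
            # Log return below -q: buy with all available cash.
            cash, shares = 0.0, cash / cur
    # Mark any open position to the last observed price.
    return cash + shares * prices[-1]


def buy_and_hold_profit(prices):
    """Final wealth of buying at the first price and holding to the end."""
    return prices[-1] / prices[0]


def profit_landscape(prices, p_values, q_values):
    """Profit evaluated on a grid of (p, q) pairs, as a nested list."""
    return [[threshold_strategy_profit(prices, p, q) for q in q_values]
            for p in p_values]


if __name__ == "__main__":
    demo_prices = [100.0, 90.0, 99.0, 99.0]
    print(threshold_strategy_profit(demo_prices, 0.05, 0.05))
    print(buy_and_hold_profit(demo_prices))
```

Evaluating `profit_landscape` on a fine grid of `p_values` and `q_values` gives the two-dimensional surface whose local maxima the abstract refers to; the toy price series in the demo is invented purely to exercise the code.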
The aim of this article is to briefly review and make new studies of
correlations and co-movements of stocks, so as to understand the
"seasonalities" and market evolution. Using the intraday data of the CAC40, we
begin by reasserting the findings of Allez and Bouchaud [New J. Phys. 13,
025010 (2011)]: the average correlation between stocks increases throughout the
day. We then use multidimensional scaling (MDS) to generate maps and
visualize the dynamic evolution of the stock market during the day. We do not
find any marked difference in the structure of the market during a day. Another
aim is to use daily data for MDS studies, and visualize or detect specific
sectors in a market and periods of crisis. We suggest that this type of
visualization may be used in identifying potential pairs of stocks for "pairs
trade".
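As a rough illustration of how an MDS map can be produced from stock correlations, the sketch below applies classical (Torgerson) MDS to a correlation matrix. The abstract does not state which MDS variant or distance was used, so both the classical variant and the distance $d_{ij} = \sqrt{2(1 - \rho_{ij})}$ are assumptions made here, and `classical_mds` is a hypothetical helper name.

```python
import numpy as np


def classical_mds(corr, n_components=2):
    """Embed items in n_components dimensions from a correlation matrix.

    Assumption (not from the source): correlations are converted to
    distances with d_ij = sqrt(2 * (1 - rho_ij)).
    """
    corr = np.asarray(corr, dtype=float)
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives from round-off.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


if __name__ == "__main__":
    demo_corr = [[1.0, 0.9, 0.0],
                 [0.9, 1.0, 0.0],
                 [0.0, 0.0, 1.0]]
    print(classical_mds(demo_corr))
```

Strongly correlated stocks land close together on the resulting map, which is the property that makes such maps a candidate tool for spotting sectors or pairs; the demo matrix is invented for illustration only.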
We investigate the possible drawbacks of employing the standard Pearson
estimator to measure correlation coefficients between financial stocks in the
presence of non-stationary behavior, and we provide empirical evidence against
the well-established common knowledge that using longer price time series
provides better, more accurate, correlation estimates. Then, we investigate the
possible consequences of instabilities in empirical correlation coefficient
measurements on optimal portfolio selection. We rely on previously published
works which provide a framework allowing to take into account possible risk
underestimations due to the non-optimality of the portfolio weights being used
in order to distinguish such non-optimality effects from risk underestimations
genuinely due to non-stationarities. We interpret such results in terms of
instabilities in some spectral properties of portfolio correlation matrices.
Prediction markets show considerable promise for developing flexible
mechanisms for machine learning. Here, machine learning markets for
multivariate systems are defined, and a utility-based framework is established
for their analysis. This differs from the usual approach of defining static
betting functions. It is shown that such markets can implement model
combination methods used in machine learning, such as product-of-experts and
mixture-of-experts approaches as equilibrium pricing models, by varying agent
utility functions. They can also implement models composed of local potentials,
and message passing methods. Prediction markets also allow for more flexible
combinations, by combining multiple different utility functions. Conversely,
the market mechanisms implement inference in the relevant probabilistic models.
This means that market mechanisms can be utilized for implementing parallelized
model building and inference for probabilistic modelling.
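For readers unfamiliar with the two model combination methods named above, the following sketch shows how a mixture of experts and a product of experts combine discrete probability distributions. It illustrates only the combination rules themselves, not the market mechanism or the utility functions of the paper; the weights and function names are hypothetical.

```python
import math


def mixture_of_experts(dists, weights):
    """Weighted arithmetic average of discrete distributions."""
    total = sum(weights)
    n_outcomes = len(dists[0])
    return [sum(w * d[i] for w, d in zip(weights, dists)) / total
            for i in range(n_outcomes)]


def product_of_experts(dists, weights):
    """Renormalised weighted geometric combination of distributions."""
    n_outcomes = len(dists[0])
    unnormalised = [math.prod(d[i] ** w for w, d in zip(weights, dists))
                    for i in range(n_outcomes)]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]


if __name__ == "__main__":
    experts = [[0.5, 0.5], [0.9, 0.1]]
    print(mixture_of_experts(experts, [1.0, 1.0]))
    print(product_of_experts(experts, [1.0, 1.0]))
```

The mixture averages beliefs, while the product sharpens them wherever the experts agree; in the market setting of the abstract, the combined distribution would play the role of the equilibrium price.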
The aim of this paper is twofold: to provide a theoretical framework and to
give further empirical support to Shiller's test of the appropriateness of
prices in the stock market based on the Cyclically Adjusted Price Earnings
(CAPE) ratio. We devote the first part of the paper to the empirical analysis
and we show that the CAPE is a powerful predictor of future long-run
performance of the market not only for the U.S. but also for countries such as
Belgium, France, Germany, Japan, the Netherlands, Norway, Sweden and
Switzerland. We show four relevant empirical facts: i) the striking ability of
the logarithmic averaged earnings-over-price ratio to predict returns of the
index; ii) how this evidence strengthens when switching from returns to gross
returns; iii) that, across different time horizons, the regression coefficients
are constant in a statistically robust way; and iv) the poor quality of the
prediction when the precursor is adjusted by the long-term interest rate. In
the second part
we provide a theoretical justification of the empirical observations. Indeed we
propose a simple model of the price dynamics in which the return growth depends
on three components: a) a momentum component, naturally justified in terms of
agents' belief that expected returns are higher in bullish markets than in
bearish ones; b) a fundamental component proportional to the log earnings over
price ratio at time zero, from which the actual stock price may deviate as an
effect of random external disturbances, and c) a driving component ensuring the
diffusive behaviour of stock prices. Under these assumptions, we are able to
prove that, if we consider a sufficiently large number of periods, the expected
rate of return and the expected gross return are linear in the initial time
value of the log earnings over price ratio, and their variance goes to zero
with rate of convergence equal to minus one.
In this paper the complex-valued best linear unbiased estimator of an unknown
constant mean of white noise was derived as the ordinary least-squares estimator
of an unknown constant mean of a random field (arithmetic mean) charged by an
imaginary error.
Financial markets are well known examples of multi-fractal complex systems
that have garnered much interest in their characterization through complex
network theory. Recent studies have used correlation-based distance metrics
for defining and analyzing financial networks. In this work the singularity
strength is employed to define a distance metric and the existence of
hierarchical structure in the Johannesburg Stock Exchange is investigated. The
multi-fractal nature of the financial market, which is otherwise hidden in
prescriptions based on the correlation coefficient, is analyzed through the use
of the method based on singularity strength. The network is shown to contain a
super cluster which accounts for half of the network size and is homogeneous
in the sectoral composition of the South African market.
We derive explicit recursive formulas for Target Close (TC) and
Implementation Shortfall (IS) in the Almgren-Chriss framework. We explain how
to compute the optimal starting and stopping times for IS and TC, respectively,
given a minimum trading size. We also show how to add a minimum participation
rate constraint (Percentage of Volume, PVol) for both TC and IS. We also study
an alternative set of risk measures for the optimisation of algorithmic trading
curves. We assume a self-similar process (e.g. L\'evy process, fractional
Brownian motion or fractal process) and define a new risk measure, the
$p$-variation, which reduces to the variance if the process is a Brownian
motion. We deduce the explicit formula for the TC and IS algorithms under a
self-similar process. We show that there is an equivalence between self-similar
models and a family of risk measures called $p$-variations: assuming a
self-similar process and calibrating empirically the parameter $p$ for the
$p$-variation yields the same result as assuming a Brownian motion and using
the $p$-variation as risk measure instead of the variance. We also show that
$p$ can be seen as a measure of the aggressiveness: $p$ increases if and only
if the TC algorithm starts later and executes faster. From the explicit
expression of the TC algorithm one can compute the sensitivities of the curve
with respect to the parameters up to any order. As an example, we compute the
first order sensitivity with respect to both a local and a global surge of
volatility. Finally, we show how the parameter $p$ of the $p$-variation can be
implied from the optimal starting time of TC, and that under this framework $p$
can be viewed as a measure of the joint impact of market impact (i.e.
liquidity) and volatility.
Recently, many studies have indicated that the minimum spanning tree (MST) network
whose metric distance is defined by using correlation coefficients has strong
implications for extracting information from return time series. However, in
many cases researchers may wish to investigate the strength of interactions but
not their directions. In order to study the strength of interaction and
connection of financial asset returns, we propose a modified minimum spanning tree
network whose metric distance is defined from absolute cross-correlation
coefficients. We investigated 69 daily financial time series, comprising
three types of financial assets (29 stock market indicator time series,
21 currency futures price time series and 19 commodity futures price time
series). Empirical analyses show that the MST network of returns is
time-dependent in overall structure, while financial assets of the same type usually
keep stable inter-connections. Moreover, the assets in the same group show similar
economic characteristics. In other words, each group is concerned with one kind of
traditional financial commodity. In addition, we find a time lag between stock
market indicator volatility time series and the EUA (EU allowances) and WTI (West
Texas Intermediate) volatility time series. The peak of the cross-correlation
function of volatility time series between EUA (or WTI) and stock market
indicators shows a significant time shift (> 20 days) from 0.
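A minimal sketch of an MST built from absolute cross-correlations is given below. The abstract only says the distance is defined from absolute cross-correlation coefficients, so the specific form $d_{ij} = \sqrt{2(1 - |\rho_{ij}|)}$ is an assumption made here by analogy with the usual correlation distance; the function names and the toy data are hypothetical. The tree is built with Kruskal's algorithm and a small union-find.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def abs_correlation_mst(series):
    """Return MST edges (name_a, name_b, distance) for a dict of returns.

    Assumption (not from the source): d = sqrt(2 * (1 - |rho|)), so that
    strongly correlated and strongly anti-correlated assets are both close.
    """
    names = list(series)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = pearson(series[a], series[b])
            dist = math.sqrt(max(0.0, 2.0 * (1.0 - abs(rho))))
            edges.append((dist, a, b))
    edges.sort()

    parent = {name: name for name in names}

    def find(node):
        # Union-find root lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for dist, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Kruskal step: keep the edge only if it joins two components.
            parent[root_a] = root_b
            tree.append((a, b, dist))
    return tree


if __name__ == "__main__":
    toy_returns = {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [-1.0, -2.0, -3.0, -4.5],
        "C": [1.0, -1.0, 1.0, -1.0],
    }
    for edge in abs_correlation_mst(toy_returns):
        print(edge)
```

In the toy data, series A and B are almost perfectly anti-correlated, so they end up adjacent in the tree; with a signed correlation distance they would instead be placed far apart, which is the difference the modified metric is meant to capture.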
We analyze a controlled price formation experiment in the laboratory that
shows evidence for bubbles. We calibrate two models that demonstrate with high
statistical significance that these laboratory bubbles have a tendency to grow
faster than exponential due to positive feedback. We show that the positive
feedback operates by traders continuously upgrading their over-optimistic
expectations of future returns based on past prices rather than on realized
returns.
Predicting X from Twitter is a popular fad within the Twitter research
subculture. It seems both appealing and relatively easy. Among such
studies, electoral prediction is perhaps the most attractive, and at this moment
there is a growing body of literature on the topic. This is not only an
interesting research problem but, above all, it is extremely difficult.
However, most of the authors seem to be more interested in claiming positive
results than in providing sound and reproducible methods. It is also especially
worrisome that many recent papers seem to only acknowledge those studies
supporting the idea of Twitter predicting elections, instead of conducting a
balanced literature review showing both sides of the matter. After reading many
such papers, I decided to write such a survey myself. Hence, in this
paper, every study relevant to the matter of electoral prediction using social
media is discussed. From this review it can be concluded that the predictive
power of Twitter regarding elections has been greatly exaggerated, and that
hard research problems still lie ahead.
The timing patterns of human communication in social networks are not random.
On the contrary, communication is dominated by emergent statistical laws such
as non-trivial correlations and clustering. Recently, we found long-term
correlations in users' activity in social communities. Here, we extend this
work to study collective behavior of the whole community. The goal is to
understand the origin of clustering and long-term persistence. At the
individual level, we find that the correlations in activity are a byproduct of
the clustering expressed in the power-law distribution of inter-event times of
single users. On the contrary, the activity of the whole community presents
long-term correlations that are a true emergent property of the system, i.e.
they are not related to the distribution of inter-event times. This result
suggests the existence of collective behavior, possibly arising from nontrivial
communication patterns through the embedding social network.
The potential approach is a general and simple method for modelling interest
rates, foreign exchange rates, and in principle other types of financial
assets. This paper takes data on some liquid interest rate derivatives, and
fits potential models using a small finite-state Markov chain as the base
Markov process.
We introduce a new threshold model of social networks, in which the nodes
influenced by their neighbours can adopt one out of several alternatives. We
characterize social networks for which adoption of a product by the whole
network is possible (respectively necessary) and the ones for which a unique
outcome is guaranteed. These characterizations directly yield polynomial time
algorithms that allow us to determine whether a given social network satisfies
one of the above properties.
We also study algorithmic questions for networks without unique outcomes. We
show that the problem of determining whether a final network exists in which
all nodes adopted some product is NP-complete. In turn, the problems of
determining whether a given node adopts some (respectively, a given) product in
some (respectively, all) network(s) are either co-NP-complete or can be solved
in polynomial time.
Further, we show that the problem of computing the minimum possible spread of
a product is NP-hard to approximate with an approximation ratio better than
$\Omega(n)$, in contrast to the maximum spread, which is efficiently
computable. Finally, we clarify that some of the above problems can be solved
in polynomial time when there are only two products.
We study networks that display community structure -- groups of nodes within
which connections are unusually dense. Using methods from random matrix theory,
we calculate the spectra of such networks in the limit of large size, and hence
demonstrate the presence of a phase transition in matrix methods for community
detection, such as the popular modularity maximization method. The transition
separates a regime in which such methods successfully detect the community
structure from one in which the structure is present but is not detected. By
comparing these results with recent analyses of maximum-likelihood methods we
are able to show that spectral modularity maximization is an optimal detection
method in the sense that no other method will succeed in the regime where the
modularity method fails.
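To make the spectral modularity method concrete, the sketch below splits a network into two groups using the sign of the leading eigenvector of the modularity matrix $B = A - k k^{T} / 2m$. This is the standard two-way spectral bisection, shown only as an illustration; it does not reproduce the random matrix calculation of the paper, and the function name and the toy graph are hypothetical.

```python
import numpy as np


def spectral_bisection(adjacency):
    """Two-way split by the leading eigenvector of the modularity matrix.

    Returns (labels, modularity), where labels is a 0/1 array and
    modularity is Q = s^T B s / (4m) for the +1/-1 assignment s.
    """
    a = np.asarray(adjacency, dtype=float)
    degrees = a.sum(axis=1)
    two_m = degrees.sum()  # twice the number of edges
    modularity_matrix = a - np.outer(degrees, degrees) / two_m
    eigvals, eigvecs = np.linalg.eigh(modularity_matrix)
    leading = eigvecs[:, np.argmax(eigvals)]
    labels = (leading >= 0).astype(int)
    s = np.where(labels == 1, 1.0, -1.0)
    q = float(s @ modularity_matrix @ s) / (2.0 * two_m)
    return labels, q


if __name__ == "__main__":
    # Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
    toy = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        toy[i, j] = toy[j, i] = 1.0
    print(spectral_bisection(toy))
```

Which group receives label 0 and which receives label 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. In the undetectable regime described in the abstract, the leading eigenvector would carry no information about the planted communities and this split would be no better than chance.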
A theory of exceptional extreme events, characterized by their abnormal sizes
compared with the rest of the distribution, is presented. Such outliers, called
"dragon-kings", have been reported in the distribution of financial drawdowns,
city-size distributions (e.g., Paris in France and London in the UK), in
material failure, epileptic seizure intensities, and other systems. Within our
theory, the large outliers are interpreted as droplets of Bose-Einstein
condensate: the appearance of outliers is a natural consequence of the
occurrence of Bose-Einstein condensation controlled by the relative degree of
attraction, or utility, of the largest entities. For large populations, Zipf's
law is recovered (except for the dragon-king outliers). The theory thus
provides a parsimonious description of the possible coexistence of a power law
distribution of event sizes (Zipf's law) and dragon-king outliers.
Here, a scenario is proposed, according to which a generic self-organized
critical (SOC) system can be looked upon as a Witten-type topological field
theory (W-TFT) with spontaneously broken Becchi-Rouet-Stora-Tyutin (BRST)
symmetry. One of the conditions for SOC is slow driving noise, which
unambiguously suggests the Stratonovich interpretation of the corresponding
stochastic differential equation (SDE). This, in turn, necessitates the use of
the Parisi-Sourlas-Wu stochastic quantization procedure, which straightforwardly
leads to a model with BRST-exact action, i.e., to a W-TFT. In the parameter
space of the SDE, there must exist full-dimensional regions where the
BRST-symmetry is spontaneously broken by instantons, which in the context of
SOC are essentially avalanches. In these regions, the avalanche-type SOC
dynamics is liberated from an otherwise rightful dynamics-less W-TFT, and a
Goldstone mode of Faddeev-Popov ghosts exists. Goldstinos represent moduli of
instantons (avalanches) and, being gapless, are responsible for the critical
avalanche distribution in the low-energy, long-wavelength limit. The above
arguments are robust against moderate variations of the SDE's parameters and
the criticality is "self-tuned". The proposition of this paper suggests that
the machinery of W-TFTs may find its applications in many different areas of
modern science studying various physical realizations of SOC. It also suggests
that there may in principle exist a connection between some SOC systems and the
concept of topological quantum computing.