FuturICT foundations are social science, complex systems science, and ICT.
The main concerns and challenges in the science of complex systems in the
context of FuturICT are laid out in this paper with special emphasis on the
Complex Systems route to Social Sciences. These include complex systems having:
many heterogeneous interacting parts; multiple scales; complicated transition
laws; unexpected or unpredicted emergence; sensitive dependence on initial
conditions; path-dependent dynamics; networked hierarchical connectivities;
interaction of autonomous agents; self-organisation; non-equilibrium dynamics;
combinatorial explosion; adaptivity to changing environments; co-evolving
subsystems; ill-defined boundaries; and multilevel dynamics. In this context,
science is seen as the process of abstracting the dynamics of systems from
data. This presents many challenges including: data gathering by large-scale
experiment, participatory sensing and social computation, managing huge
distributed dynamic and heterogeneous databases; moving from data to dynamical
models, going beyond correlations to cause-effect relationships, understanding
the relationship between simple and comprehensive models with appropriate
choices of variables, ensemble modeling and data assimilation, modeling systems
of systems of systems with many levels between micro and macro; and formulating
new approaches to prediction, forecasting, and risk, especially in systems that
can reflect on and change their behaviour in response to predictions, and
systems whose apparently predictable behaviour is disrupted by apparently
unpredictable rare or extreme events. These challenges are part of the FuturICT
agenda.
The aim of this paper is twofold: to provide a theoretical framework and to
give further empirical support to Shiller's test of the appropriateness of
prices in the stock market based on the Cyclically Adjusted Price Earnings
(CAPE) ratio. We devote the first part of the paper to the empirical analysis
and we show that the CAPE is a powerful predictor of future long run
performances of the market not only for the U.S. but also for countries such as
Belgium, France, Germany, Japan, the Netherlands, Norway, Sweden and
Switzerland. We show four relevant empirical facts: i) the striking ability of
the logarithmic averaged earnings-over-price ratio to predict returns of the
index; ii) that this evidence strengthens when switching from returns to gross
returns; iii) that the regression coefficients are constant in a statistically
robust way across different time horizons; and iv) the poor quality of the
prediction when the precursor is adjusted with the long-term interest rate. In
the second part
we provide a theoretical justification of the empirical observations. Indeed we
propose a simple model of the price dynamics in which the return growth depends
on three components: a) a momentum component, naturally justified in terms of
agents' belief that expected returns are higher in bullish markets than in
bearish ones; b) a fundamental component proportional to the log earnings over
price ratio at time zero, from which the actual stock price may deviate as an
effect of random external disturbances, and c) a driving component ensuring the
diffusive behaviour of stock prices. Under these assumptions, we are able to
prove that, if we consider a sufficiently large number of periods, the expected
rate of return and the expected gross return are linear in the initial time
value of the log earnings over price ratio, and their variance goes to zero
with rate of convergence equal to minus one.
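The predictor used in the empirical part can be illustrated with a minimal Python sketch. It assumes the usual Shiller convention of averaging earnings over a trailing window (ten years of real earnings in his test). The function names, the `window` parameter and the plain least-squares fit are illustrative assumptions, not the authors' code.

```python
import math

def log_averaged_ep(prices, earnings, window):
    """Log of trailing-average earnings over current price, one value per period.

    `window` is an assumed parameter (ten years in Shiller's CAPE).
    """
    values = []
    for t in range(window - 1, len(prices)):
        avg_earnings = sum(earnings[t - window + 1 : t + 1]) / window
        values.append(math.log(avg_earnings / prices[t]))
    return values

def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

Regressing realized long-horizon returns on the output of `log_averaged_ep` with `ols_fit` would give the kind of linear relation the abstract describes; the choice of horizon is left to the caller.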
The aim of this article is to briefly review and make new studies of
correlations and co-movements of stocks, so as to understand the
"seasonalities" and market evolution. Using the intraday data of the CAC40, we
begin by reasserting the findings of Allez and Bouchaud [New J. Phys. 13,
025010 (2011)]: the average correlation between stocks increases throughout the
day. We then use multidimensional scaling (MDS) in generating maps and
visualizing the dynamic evolution of the stock market during the day. We do not
find any marked difference in the structure of the market during a day. Another
aim is to use daily data for MDS studies, and visualize or detect specific
sectors in a market and periods of crisis. We suggest that this type of
visualization may be used in identifying potential pairs of stocks for "pairs
trade".
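As a hedged illustration of the input to such an MDS map, the sketch below converts return correlations into pairwise distances using the common transformation d = sqrt(2 (1 - rho)). This particular transformation and the function names are assumptions; the abstract does not say which distance the authors use. The resulting matrix could be passed to any MDS routine.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equally long return series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_distance_matrix(returns):
    """Symmetric matrix of d_ij = sqrt(2 * (1 - rho_ij)) (assumed metric)."""
    n = len(returns)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rho = pearson(returns[i], returns[j])
            # max() guards against tiny negative values from rounding
            dist[i][j] = dist[j][i] = math.sqrt(max(0.0, 2.0 * (1.0 - rho)))
    return dist
```

Under this metric, perfectly correlated stocks sit at distance 0 and perfectly anti-correlated ones at distance 2, so close pairs on the map are natural candidates to inspect.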
The European sovereign debt crisis has impaired many European banks. The
distress of the European banks may be transmitted worldwide and result in
large-scale knock-on defaults of financial institutions. This study presents a
computer simulation model to analyze the risk of insolvency of banks and
defaults in a bank credit network. Simulation experiments reproduce the
knock-on default, and quantify the impact which is imposed on the number of
bank defaults by heterogeneity of the bank credit network, the equity capital
ratio of banks, and the capital surcharge on big banks.
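The knock-on mechanism can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the study's simulation: a defaulting borrower is assumed to inflict a full loss of the exposure on each lender, and a lender defaults once its accumulated losses reach its equity. The names `equity`, `exposure` and `knock_on_defaults` are hypothetical.

```python
def knock_on_defaults(equity, exposure, initially_failed):
    """Return the set of banks that end up in default.

    equity[i]      : equity capital of bank i
    exposure[i][j] : amount bank i has lent to bank j (assumed fully lost
                     when j defaults)
    """
    n = len(equity)
    losses = [0.0] * n
    defaulted = set(initially_failed)
    frontier = list(initially_failed)
    while frontier:
        failed = frontier.pop()
        for lender in range(n):
            if lender in defaulted or exposure[lender][failed] <= 0:
                continue
            losses[lender] += exposure[lender][failed]
            if losses[lender] >= equity[lender]:
                defaulted.add(lender)
                frontier.append(lender)
    return defaulted
```

Raising `equity` for selected banks in such a sketch is one simple way to explore how a capital surcharge changes the number of defaults.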
The potential approach is a general and simple method for modelling interest
rates, foreign exchange rates, and in principle other types of financial
assets. This paper takes data on some liquid interest rate derivatives, and
fits potential models using a small finite-state Markov chain as the base
Markov process.
In this paper, we propose a simple randomized protocol for identifying
trusted nodes based on personalized trust in large scale distributed networks.
The problem of identifying trusted nodes, based on personalized trust, in a
large network setting stems from the huge computation and message overhead
involved in exhaustively calculating and propagating the trust estimates by the
remote nodes. However, in any practical scenario, nodes generally communicate
with a small subset of nodes and thus exhaustively estimating the trust of all
the nodes can lead to huge resource consumption. In contrast, our mechanism can
be tuned to locate a desired subset of trusted nodes, based on the allowable
overhead, with respect to a particular user. The mechanism is based on a simple
exchange of random walk messages and nodes counting the number of times they
are being hit by random walkers of nodes in their neighborhood. Simulation
results to analyze the effectiveness of the algorithm show that using the
proposed algorithm, nodes identify the top trusted nodes in the network with a
very high probability by exploring only around 45% of the total nodes, which in
turn generates nearly 90% less overhead than an exhaustive trust
estimation mechanism, named TrustWebRank. Finally, we provide a measure of the
global trustworthiness of a node; simulation results indicate that the measures
generated using our mechanism differ by only around 0.6% as compared to
TrustWebRank.
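The hit-counting idea can be sketched as follows. This is a simplified illustration, not the proposed protocol: walkers are assumed to step to a uniformly chosen out-neighbour, whereas a real implementation would presumably bias steps by trust values. The function name and the `walks` and `length` parameters are hypothetical knobs that stand in for the allowable overhead.

```python
import random
from collections import Counter

def random_walk_hits(neighbors, source, walks, length, rng):
    """Count how often each node is hit by walkers launched from `source`.

    neighbors : dict mapping a node to a list of nodes it trusts
    walks     : number of walkers launched (controls message overhead)
    length    : maximum number of hops per walker
    """
    hits = Counter()
    for _ in range(walks):
        node = source
        for _ in range(length):
            out = neighbors.get(node)
            if not out:
                break  # dead end: the walker stops
            node = rng.choice(out)
            hits[node] += 1
    return hits
```

Calling `random_walk_hits(graph, "u", 1000, 5, random.Random(0)).most_common(10)` would then give a candidate list of the nodes most often reached from user `"u"`.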
We consider a system of diffusion processes that interact through their
empirical mean and have a stabilizing force acting on each of them,
corresponding to a bistable potential. There are three parameters that
characterize the system: the strength of the intrinsic stabilization, the
strength of the external random perturbations, and the degree of cooperation or
interaction between them. The latter is the rate of mean reversion of each
component to the empirical mean of the system. We interpret this model in the
context of systemic risk and analyze in detail the effect of cooperation
between the components, that is, the rate of mean reversion. We show that in a
certain regime of parameters increasing cooperation tends to increase the
stability of the individual agents but it also increases the overall or
systemic risk. We use the theory of large deviations of diffusions interacting
through their mean field.
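A minimal Euler-Maruyama sketch of such a system is given below. The quartic potential V(x) = x^4/4 - x^2/2, with wells at -1 and +1, is an assumed example of a bistable potential; the abstract does not specify one. Here `h` is the intrinsic stabilization, `sigma` the noise strength and `theta` the rate of mean reversion to the empirical mean.

```python
import math
import random

def simulate_mean_field(n, h, theta, sigma, dt, steps, rng, x0=-1.0):
    """Simulate n interacting diffusions and return the empirical mean path.

    Each component follows
        dx = -h * V'(x) dt + theta * (mean - x) dt + sigma dW,
    with the assumed potential V(x) = x**4 / 4 - x**2 / 2.
    """
    x = [x0] * n
    means = []
    for _ in range(steps):
        mean = sum(x) / n
        x = [
            xi
            + (-h * (xi ** 3 - xi) + theta * (mean - xi)) * dt
            + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for xi in x
        ]
        means.append(sum(x) / n)
    return means
```

In this picture a systemic event corresponds to the empirical mean moving from one well to the other, which can be explored by varying `theta` at fixed `h` and `sigma`.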
We study the structure of inter-industry relationships using networks of
money flows between industries in 20 national economies. We find these networks
vary around a typical structure characterized by a Weibull link weight
distribution, exponential industry size distribution, and a common community
structure. The community structure is hierarchical, with the top level of the
hierarchy comprising five industry communities: food industries, chemical
industries, manufacturing industries, service industries, and extraction
industries.
We characterize the distributions of size and duration of avalanches
propagating in complex networks. By an avalanche we mean the sequence of events
initiated by the externally stimulated `excitation' of a network node, which
may, with some probability, then stimulate subsequent firings of the nodes to
which it is connected, resulting in a cascade of firings. This type of process
is relevant to a wide variety of situations, including neuroscience, cascading
failures on electrical power grids, and epidemiology. We find that the
statistics of avalanches can be characterized in terms of the largest
eigenvalue and corresponding eigenvector of an appropriate adjacency matrix
which encodes the structure of the network. By using mean-field analyses,
previous studies of avalanches in networks have not considered the effect of
network structure on the distribution of size and duration of avalanches. Our
results apply to individual networks (rather than network ensembles) and
provide expressions for the distributions of size and duration of avalanches
starting at particular nodes in the network. These findings might find
application in the analysis of branching processes in networks, such as
cascading power grid failures and critical brain dynamics. In particular, our
results show that some experimental signatures of critical brain dynamics
(i.e., power-law distributions of size and duration of neuronal avalanches),
are robust to complex underlying network topologies.
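The process can be sketched in Python under simplifying assumptions: `A[i][j]` is taken to be the probability that an excited node `i` excites node `j` in the next step, a node excited several times in one step is counted once, and the largest eigenvalue is estimated by power iteration (which assumes a primitive non-negative matrix). Function names are hypothetical.

```python
import random

def avalanche(A, start, rng, max_steps=10_000):
    """Run one avalanche from `start`; return (size, duration).

    size     : total number of node excitations, including the first
    duration : number of time steps with at least one excited node
    """
    active = {start}
    size, duration = 1, 1
    while duration < max_steps:
        excited = set()
        for i in active:
            for j, p in enumerate(A[i]):
                if p > 0 and rng.random() < p:
                    excited.add(j)
        if not excited:
            break
        size += len(excited)
        duration += 1
        active = excited
    return size, duration

def largest_eigenvalue(A, iterations=500):
    """Power-iteration estimate of the largest eigenvalue of non-negative A."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(abs(c) for c in w)
        if lam == 0.0:
            return 0.0
        v = [c / lam for c in w]
    return lam
```

Repeating `avalanche` many times from a fixed start node gives empirical size and duration histograms that can be compared across networks with different `largest_eigenvalue(A)`.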
Given a budget and arbitrary cost for selecting each node, the budgeted
influence maximization (BIM) problem concerns selecting a set of seed nodes to
disseminate some information that maximizes the total number of nodes
influenced (termed as influence spread) in social networks at a total cost no
more than the budget. Our proposed seed selection algorithm for the BIM problem
guarantees an approximation ratio of (1 - 1/sqrt(e)). The seed selection
algorithm needs to calculate the influence spread of candidate seed sets, which
is known to be #P-hard. Identifying the linkage between the computation of
marginal probabilities in Bayesian networks and the influence spread, we devise
efficient heuristic algorithms for the latter problem. Experiments using both
large-scale social networks and synthetically generated networks demonstrate
superior performance of the proposed algorithm with moderate computation costs.
Moreover, synthetic datasets allow us to vary the network parameters and gain
important insights on the impact of graph structures on the performance of
different algorithms.
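For orientation, a generic cost-aware greedy baseline for BIM is sketched below. It is not the proposed algorithm and carries no claim to the (1 - 1/sqrt(e)) guarantee: it simply picks, at each step, the affordable node with the best marginal spread per unit cost. The independent cascade model with a uniform activation probability `p` and the Monte Carlo estimator are assumptions made for illustration.

```python
import random

def ic_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of influence spread under independent cascade."""
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            u = frontier.pop()
            for v in graph.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs

def greedy_budgeted_seeds(nodes, cost, budget, spread):
    """Pick seeds by marginal spread per unit cost within the budget.

    spread : callable mapping a list of seeds to an estimated spread
    """
    seeds, spent = [], 0.0
    current = spread(seeds)
    remaining = set(nodes)
    while True:
        best, best_ratio, best_value = None, 0.0, current
        for v in sorted(remaining):
            if spent + cost[v] > budget:
                continue
            value = spread(seeds + [v])
            ratio = (value - current) / cost[v]
            if ratio > best_ratio:
                best, best_ratio, best_value = v, ratio, value
        if best is None:
            return seeds
        seeds.append(best)
        spent += cost[best]
        current = best_value
        remaining.discard(best)
```

Every call to `spread` is a full simulation, which is why cheaper heuristics for estimating influence spread matter in practice.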
In two previous papers the author developed a second-order price adjustment
(tâtonnement) process. This paper extends the approach to include both
quantity and price adjustments. We demonstrate three results: an analogue to
physical energy, called "activity" arises naturally in the model, and is not
conserved in general; price and quantity trajectories must either end at a
local minimum of a scalar potential or circulate endlessly; and disturbances
into a subspace of substitutable commodities decay over time. From this we
argue, although we do not prove, that the model features global stability,
combined with local instability, a characteristic of many real markets.
Following these observations and a brief survey of empirical results for
price-setting and consumption behavior in markets for "real" goods (as opposed
to financial markets), we conjecture that Stigler and Becker's well-known
theory of consumer preference opens the possibility of substantial degeneracy
in commodity space, and therefore that price and quantity trajectories could
lie on a relatively low-dimensional subspace within the full commodity space.
A way to fight your traffic tickets. The paper was awarded a special prize of
$400 that the author did not have to pay to the state of California.
In view of enormous, extremely surprising and completely unexpected public
interest in this work, we have added an appendix answering the two most common
questions.
A limit order book provides information on available limit order prices and
their volumes. Based on these quantities, we give an empirical result on the
relationship between the bid-ask liquidity balance and trade sign and we show
that liquidity balance on best bid/best ask is quite informative for predicting
the future market order's direction. Moreover, we define a price jump as a sell
(buy) market order arrival that is executed at a price smaller
(larger) than the best bid (best ask) price at the moment just after the
preceding market order arrival. Features are then extracted related to limit
order volumes, limit order price gaps, market order information and limit order
event information. Logistic regression is applied to predict the price jump
from the limit order book's features. LASSO logistic regression is introduced
for variable selection, which lets us highlight the
importance of different features in predicting the future price jump. To
remove intraday seasonality, the analysis is based on two
separate datasets: a morning dataset and an afternoon dataset. Based on an
analysis of the forty largest French stocks of the CAC40, we find that trade
sign and market order size as well as the liquidity on the best bid (best ask)
are consistently informative for predicting the incoming price jump.
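Two of the quantities above can be written down directly. In the sketch below, the normalized form of the liquidity balance is an assumption, since the abstract does not give its formula, while `is_price_jump` follows the definition stated in the text. Both function names are hypothetical.

```python
def liquidity_balance(bid_volume, ask_volume):
    """Assumed balance in [-1, 1]: positive when the best bid is deeper."""
    return (bid_volume - ask_volume) / (bid_volume + ask_volume)

def is_price_jump(side, exec_price, best_bid, best_ask):
    """Price jump as defined in the text.

    best_bid and best_ask are the quotes just after the preceding market
    order; a sell order jumps if executed below that best bid, a buy order
    if executed above that best ask.
    """
    if side == "sell":
        return exec_price < best_bid
    if side == "buy":
        return exec_price > best_ask
    raise ValueError("side must be 'buy' or 'sell'")
```

Labels produced by `is_price_jump` and features such as `liquidity_balance` are the kind of inputs a LASSO-penalized logistic regression would take for variable selection.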
The practice of valuation by marking-to-market with current trading prices is
seriously flawed. Under leverage the problem is particularly dramatic: due to
the concave form of market impact, selling always initially causes the expected
leverage to increase. There is a critical leverage above which it is impossible
to exit a portfolio without leverage going to infinity and bankruptcy becoming
likely. Standard risk-management methods give no warning of this problem, which
easily occurs for aggressively leveraged positions in illiquid markets. We
propose an alternative accounting procedure based on the estimated market
impact of liquidation that removes the illusion of profit. This should curb the
leverage cycle and contribute to an enhanced stability of financial markets.
We consider the pricing of European-style structured credit payoffs in a
static framework, where the underlying default times are independent given a
common factor. A practical application would consist of the pricing of
nth-to-default baskets under the Gaussian copula model (GCM). We provide
necessary and sufficient conditions so that the corresponding asset prices are
martingales and introduce the concept of "break-even" correlation matrix. When
no sudden jump-to-default events occur, we show that the perfect replication of
these payoffs under the GCM is obtained if and only if the underlying single
name credit spreads follow a particular family of dynamics. We calculate the
corresponding break-even correlations and we exhibit a class of Merton-style
models that are consistent with this result. We explain why the GCM does not
have a lot of competitors among the class of one-period static models, except
perhaps the Clayton copula.
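The conditional-independence structure can be illustrated with a small Python sketch. It assumes a one-factor Gaussian copula with a single flat correlation `rho`, which is a simplification of the general correlation matrices discussed above. Given the common factor `m`, defaults are independent, so the probability of at least `n` defaults follows from a simple recursion. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_default_prob(p, rho, m):
    """Default probability of one name given the common factor value m.

    p   : unconditional default probability over the horizon (0 < p < 1)
    rho : assumed flat correlation (0 <= rho < 1)
    """
    threshold = _STD_NORMAL.inv_cdf(p)
    return _STD_NORMAL.cdf((threshold - math.sqrt(rho) * m) / math.sqrt(1.0 - rho))

def prob_at_least_n(probs, n):
    """P(at least n defaults) for independent names with the given probs."""
    dist = [1.0]  # dist[k] = probability of exactly k defaults so far
    for q in probs:
        updated = [0.0] * (len(dist) + 1)
        for k, weight in enumerate(dist):
            updated[k] += weight * (1.0 - q)
            updated[k + 1] += weight * q
        dist = updated
    return sum(dist[n:])
```

Integrating `prob_at_least_n` over the standard normal density of `m` would give the unconditional probability that the nth default occurs before the horizon.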
Understanding how institutional changes within academia may affect the
overall potential of science requires a better quantitative representation of
how careers evolve over time. Since knowledge spillovers, cumulative advantage,
competition, and collaboration are distinctive features of the academic
profession, both the employment relationship and the procedures for assigning
recognition and allocating funding should be designed to account for these
factors. We study the annual production n_{i}(t) of a given scientist i by
analyzing longitudinal career data for 200 leading scientists and 100 assistant
professors from the physics community. We compare our results with 21,156
sports careers. Our empirical analysis of individual productivity dynamics
shows that (i) there are increasing returns for the top individuals within the
competitive cohort, and that (ii) the distribution of production growth is a
leptokurtic "tent-shaped" distribution that is remarkably symmetric. Our
methodology is general, and we speculate that similar features appear in other
disciplines where academic publication is essential and collaboration is a key
feature. We introduce a model of proportional growth which reproduces these two
observations, and additionally accounts for the significantly right-skewed
distributions of career longevity and achievement in science. Using this
theoretical model, we show that short-term contracts can amplify the effects of
competition and uncertainty making careers more vulnerable to early
termination, not necessarily due to lack of individual talent and persistence,
but because of random negative production shocks. We show that fluctuations in
scientific production are quantitatively related to a scientist's collaboration
radius and team efficiency.
Nowadays, networks are almost ubiquitous. In the past decade, community
detection has received increasing interest as a way to uncover the structure of
networks by grouping nodes into communities more densely connected internally
than externally. Yet most of the effective methods available do not consider
the potential levels of organisation, or scales, a network may encompass and
are therefore limited. In this paper we present a method compatible with global
and local criteria that enables fast multi-scale community detection. The
method is derived in two algorithms, one for each type of criterion, and
implemented with 6 known criteria. Uncovering communities at various scales is
a computationally expensive task. Therefore this work puts a strong emphasis on
the reduction of computational complexity. Some heuristics are introduced for
speed-up purposes. Experiments demonstrate the efficiency and accuracy of our
method with respect to each algorithm and criterion by testing them against
large generated multi-scale networks. This study also offers a comparison
between criteria and between the global and local approaches.
We study the dynamics of the Naming Game as an opinion formation model on
time-varying social networks. This agent-based model captures the essential
features of the agreement dynamics by means of a memory-based negotiation
process. Our study focuses on the impact of time-varying properties of the
social network of the agents on the Naming Game dynamics. We investigate the
outcomes of the dynamics on two different types of time-varying data - (i) the
networks vary across days and (ii) the networks vary within very short
intervals of time (20 seconds). In the first case, we find that networks with
strong community structure hinder the system from reaching global agreement;
the evolution of the Naming Game in these networks maintains clusters of
coexisting opinions indefinitely leading to metastability. In the second case,
we investigate the evolution of the Naming Game in perfect synchronization with
the time evolution of the underlying social network shedding new light on the
traditional emergent properties of the game that differ largely from what has
been reported in the existing literature.
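A minimal Python sketch of the memory-based negotiation is given below, using the standard minimal Naming Game rules: the speaker utters a word from its inventory, inventing one if the inventory is empty; on success both agents keep only that word, otherwise the hearer adds it. Playing exactly one interaction per network snapshot is an assumption made for brevity, and the function names are hypothetical.

```python
import random

def naming_game_step(inventories, speaker, hearer, rng, new_word):
    """One negotiation; returns True on agreement."""
    if not inventories[speaker]:
        inventories[speaker].add(new_word)  # speaker invents a word
    word = rng.choice(sorted(inventories[speaker]))
    if word in inventories[hearer]:
        # success: both agents collapse their memory to the agreed word
        inventories[speaker] = {word}
        inventories[hearer] = {word}
        return True
    inventories[hearer].add(word)  # failure: hearer memorizes the word
    return False

def play_on_snapshots(snapshots, n_agents, rng):
    """Play one interaction per snapshot (assumed schedule).

    snapshots : list of edge lists, one per time window
    """
    inventories = [set() for _ in range(n_agents)]
    for step, edges in enumerate(snapshots):
        if not edges:
            continue
        a, b = rng.choice(edges)
        speaker, hearer = (a, b) if rng.random() < 0.5 else (b, a)
        naming_game_step(inventories, speaker, hearer, rng, new_word=step)
    return inventories
```

Global agreement corresponds to every inventory holding the same single word; coexisting clusters of opinions show up as several distinct words surviving across agents.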
We introduce a future orientation index to quantify the degree to which Internet users worldwide seek more information about years in the future than years in the past. We analyse Google logs and find a striking correlation between a country’s GDP and the predisposition of its inhabitants to look forward.
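One plausible way to compute such an index is sketched below. It assumes the index is the ratio of search volume for the coming year to search volume for the previous year, measured during a reference year. This definition, the function name and the input layout are assumptions for illustration only.

```python
def future_orientation_index(volume_by_year, reference_year):
    """Ratio of searches for next year to searches for last year.

    volume_by_year : dict mapping a searched-for year (e.g. 2011) to the
                     search volume observed during `reference_year`
    """
    future = volume_by_year[reference_year + 1]
    past = volume_by_year[reference_year - 1]
    return future / past
```

A value above 1 would indicate that users in a country searched more for the coming year than for the previous one.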
We consider the class of short rate interest rate models for which the short
rate is proportional to the exponential of a Gaussian Markov process x(t) in
the terminal measure, r(t) = a(t) exp(x(t)). These models include the
Black-Derman-Toy and Black-Karasinski models in the terminal measure. We show
that such interest rate models are equivalent to lattice gases with attractive
two-body interaction V(t1,t2) = -Cov(x(t1),x(t2)). We consider in some detail
the Black-Karasinski model with x(t) an Ornstein-Uhlenbeck process, and show
that it is similar to a lattice gas model considered by Kac and Helfand, with
attractive long-range two-body interactions V(x,y) = -\alpha (e^{-\gamma |x -
y|} - e^{-\gamma (x + y)}). An explicit solution for the model is given as a
sum over the states of the lattice gas, which is used to show that the model
has a phase transition similar to that found previously in the
Black-Derman-Toy model in the terminal measure.
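The correspondence can be checked numerically with a short Python sketch. It assumes an Ornstein-Uhlenbeck process dx = -gamma x dt + sigma dW started at x(0) = 0, for which the covariance has the closed form used below. Identifying \alpha with sigma^2 / (2 gamma) is inferred here by matching that covariance to the interaction above and is not stated in the abstract. Function names are hypothetical.

```python
import math
import random

def ou_covariance(t1, t2, sigma, gamma):
    """Cov(x(t1), x(t2)) for an OU process with x(0) = 0."""
    alpha = sigma ** 2 / (2.0 * gamma)
    return alpha * (math.exp(-gamma * abs(t1 - t2)) - math.exp(-gamma * (t1 + t2)))

def lattice_gas_interaction(t1, t2, sigma, gamma):
    """Two-body interaction V(t1, t2) = -Cov(x(t1), x(t2))."""
    return -ou_covariance(t1, t2, sigma, gamma)

def simulate_short_rate(a, sigma, gamma, dt, rng):
    """Sample r(t_k) = a[k] * exp(x(t_k)) on the grid t_k = k * dt.

    Uses the exact OU transition, so no discretization error in x.
    """
    decay = math.exp(-gamma * dt)
    step_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * gamma))
    x = 0.0
    rates = []
    for k, a_k in enumerate(a):
        if k > 0:
            x = x * decay + step_std * rng.gauss(0.0, 1.0)
        rates.append(a_k * math.exp(x))
    return rates
```

Because the covariance is non-negative for all positive times, the interaction returned by `lattice_gas_interaction` is never repulsive, consistent with the attractive interaction described in the text.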