A microeconomic model is developed, which accurately predicts the shape of
personal income distribution (PID) in the United States and the evolution of
the shape over time. The underlying concept is borrowed from geo-mechanics and
thus can be considered as mechanics of income distribution. The model allows
the resolution of empirical and definitional problems associated with personal
income measurements. It also provides a firm foundation for definitions of
income inequality as secondary quantities derived from the personal income distribution.
It is found that in relative terms the PID in the US has not been changing
since 1947. Effectively, the Gini coefficient has been almost constant during
the last 60 years, as reported by the Census Bureau.
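To illustrate why a distribution that is unchanged in relative terms implies an unchanged Gini coefficient, here is a minimal Python sketch using the standard rank-weighted formula. The function name `gini` and the sample incomes are hypothetical; they are not taken from the model or from Census Bureau data.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 means perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted in ascending order and ranks i starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical incomes: rescaling every income leaves the coefficient unchanged.
base = [12_000, 25_000, 40_000, 90_000]
doubled = [2 * x for x in base]
print(gini(base), gini(doubled))
```

Because the coefficient depends only on income shares, a uniform rise in nominal incomes does not move it.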
The process of collecting and organizing sets of observations represents a
common theme throughout the history of science. However, despite the ubiquity
of scientists measuring, recording, and analyzing the dynamics of different
processes, an extensive organization of scientific time-series data and
analysis methods has never been performed. Addressing this, annotated
collections of over 35 000 real-world and model-generated time series and over
9000 time-series analysis algorithms are analyzed in this work. We introduce
reduced representations of both time series, in terms of their properties
measured by diverse scientific methods, and of time-series analysis methods, in
terms of their behaviour on empirical time series, and use them to organize
these interdisciplinary resources. This new approach to comparing across
diverse scientific data and methods allows us to organize time-series datasets
automatically according to their properties, retrieve alternatives to
particular analysis methods developed in other scientific disciplines, and
automate the selection of useful methods for time-series classification and
regression tasks. The broad scientific utility of these tools is demonstrated
on datasets of electroencephalograms, self-affine time series, heart beat
intervals, speech signals, and others, in each case contributing novel analysis
techniques to the existing literature. Highly comparative techniques that
compare across an interdisciplinary literature can thus be used to guide more
focused research in time-series analysis for applications across the scientific
disciplines.
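The idea of a reduced representation can be sketched in a few lines of Python: each time series is mapped to a short vector of properties, and series are then compared through those vectors. The three features used here (mean, standard deviation, lag-1 autocorrelation) and the names `feature_vector`, `feature_distance` and `most_similar` are illustrative assumptions, standing in for the thousands of analysis methods the work actually uses.

```python
import math


def feature_vector(series):
    """Map a time series to [mean, standard deviation, lag-1 autocorrelation]."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    if var == 0:
        ac1 = 0.0  # autocorrelation is undefined for a constant series
    else:
        cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
        ac1 = cov / (n * var)
    return [mean, math.sqrt(var), ac1]


def feature_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, library):
    """Name of the series in `library` (dict name -> series) closest to `query`."""
    q = feature_vector(query)
    return min(library, key=lambda name: feature_distance(q, feature_vector(library[name])))


library = {"alternating": [1.0, -1.0] * 50, "ramp": [float(i) for i in range(100)]}
print(most_similar([2.0, -2.0] * 50, library))
```

In a realistic setting the features would be normalised across the collection before distances are computed, so that no single property dominates.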
This editorial opens the special issues that the Journal of Statistical
Physics has dedicated to the growing field of statistical physics modeling of
social dynamics. The issues include contributions from physicists and social
scientists, with the goal of fostering better communication between these two
communities.
Citation numbers and other quantities derived from bibliographic databases
are becoming standard tools for the assessment of productivity and impact of
research activities. Though widely used, their statistical properties
have not yet been well established. This is especially true in the case of
bibliometric indicators aimed at the evaluation of individual scholars, because
large-scale data sets are typically difficult to retrieve. Here, we take
advantage of a recently introduced large bibliographic data set, Google Scholar
Citations, which collects the entire publication record of individual scholars.
We analyze the scientific profile of more than 30,000 researchers, and study
the relation between the h-index, the number of publications and the number of
citations of individual scientists. While the number of publications of a
scientist has a rather weak relation with his/her h-index, we find that the
h-index of a scientist is strongly correlated with the number of citations that
she/he has received, so that the number of citations can effectively be used
as a proxy of the h-index. Allowing for the h-index to depend on both the
number of citations and the number of publications, we find only a minor
improvement.
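For reference, the h-index discussed above can be computed from a list of per-paper citation counts as in the following minimal Python sketch. The function name `h_index` and the sample citation counts are hypothetical and are not drawn from the Google Scholar Citations data.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical publication record: citations per paper.
record = [25, 8, 5, 3, 3]
h = h_index(record)
# By definition h papers hold at least h citations each, so h**2 <= total citations.
print(h, sum(record))
```

The definition alone already ties the h-index to the citation total through the bound h² ≤ total citations, which is consistent with citations being a natural proxy.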
The patterns of life exhibited by large populations have been described and
modeled both as a basic science exercise and for a range of applied goals such
as reducing automotive congestion, improving disaster response, and even
predicting the location of individuals. However, these studies previously had
limited access to conversation content, rendering changes in expression as a
function of movement invisible. In addition, they typically use the
communication between a mobile phone and its nearest antenna tower to infer
position, limiting the spatial resolution of the data to the geographical
region serviced by each cellphone tower. We use a collection of 37 million
geolocated tweets to characterize the movement patterns of 180,000 individuals,
taking advantage of several orders of magnitude of increased spatial accuracy
relative to previous work. Employing the recently developed sentiment analysis
instrument known as the \textit{hedonometer}, we characterize changes in word
usage as a function of movement, and find that expressed happiness increases
logarithmically with distance from an individual's average location.
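A logarithmic dependence of this kind can be checked by ordinary least squares on the logarithm of distance. The sketch below fits score = intercept + slope * log10(distance); the function name `fit_log_trend` and all numbers are hypothetical placeholders, not values from the tweet data set or the hedonometer.

```python
import math


def fit_log_trend(distances, scores):
    """Least-squares fit of scores = intercept + slope * log10(distance)."""
    xs = [math.log10(d) for d in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical binned data: distance from average location (km) and mean score.
distances_km = [1, 10, 100, 1000]
mean_scores = [6.00, 6.02, 6.04, 6.06]
print(fit_log_trend(distances_km, mean_scores))
```

A roughly constant slope across decades of distance is what "increases logarithmically" means in practice.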
We describe an agent-based simulation of a fictional (but feasible)
information trading business. The Gas Price Information Trader (GPIT) buys
information about real-time gas prices in a metropolitan area from drivers and
resells the information to drivers who need to refuel their vehicles.
Our simulation uses real world geographic data, lifestyle-dependent driving
patterns and vehicle models to create an agent-based model of the drivers. We
use real world statistics of gas price fluctuation to create scenarios of
temporal and spatial distribution of gas prices. The price of the information
is determined on a case-by-case basis through a simple negotiation model. The
trader and the customers adapt their negotiation strategies based on
their historical profits.
We are interested in the general properties of the emerging information
market: the amount of realizable profit and its distribution between the trader
and customers, the business strategies necessary to keep the market operational
(such as promotional deals), the price elasticity of demand and the impact of
pricing strategies on the profit.
In this paper we complete and extend our previous work on stochastic control
applied to high frequency market-making with inventory constraints and
directional bets. Our new model admits several state variables (e.g. market
spread, stochastic volatility and intensities of market orders) provided the
full system is Markov. The solution of the corresponding HJB equation is exact
in the case of zero inventory risk. The inventory risk enters into play in two
ways: a path-dependent penalty based on the volatility and a penalty at expiry
based on the market spread. We apply perturbation methods to the inventory
risk parameter and obtain explicitly the solution and its controls up to first
order. We also include transaction costs; we show that the spread of the
market-maker is widened to compensate for the transaction costs, but the expected
gain per traded spread remains constant. We perform several numerical
simulations to assess the effect of the parameters on the PNL, showing in
particular how the directional bet and the inventory risk change the shape of
the PNL density. Finally, we extend our results to the case of multi-asset
market-making strategies; we show that the correct notion of inventory risk is
the L2-norm of the (multi-dimensional) inventory with respect to the inventory
penalties.
We use data on wealth of the richest persons taken from the "rich lists"
provided by business magazines like Forbes to test whether the upper tails of wealth
distributions follow, as often claimed, a power-law behaviour. The data sets
used cover the world's richest persons over 1996-2012, the richest Americans
over 1988-2012, the richest Chinese over 2006-2012 and the richest Russians
over 2004-2011. Using a recently introduced comprehensive empirical methodology
for detecting power laws, which allows for testing goodness of fit as well as
for comparing the power-law model with rival distributions, we find that a
power-law model is consistent with data only in 35% of the analysed data sets.
Moreover, even if wealth data are consistent with the power-law model, usually
they are also consistent with some rivals like the log-normal or stretched
exponential distributions.
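Two ingredients commonly used when testing for power laws can be sketched in Python: the maximum-likelihood estimate of the exponent of a continuous power-law tail, and the Kolmogorov-Smirnov distance between the data and the fitted model. This is only a partial illustration under stated assumptions: the lower cutoff `x_min` is taken as given, the function name `fit_power_law_tail` is hypothetical, and the goodness-of-fit p-values and comparisons with rival distributions described above are not reproduced.

```python
import math


def fit_power_law_tail(values, x_min):
    """Fit p(x) ~ x**(-alpha) for x >= x_min; return (alpha, ks_distance)."""
    tail = sorted(v for v in values if v >= x_min)
    n = len(tail)
    log_sum = sum(math.log(v / x_min) for v in tail)
    if n == 0 or log_sum == 0:
        raise ValueError("need observations strictly above x_min")
    # Maximum-likelihood estimate for a continuous power law.
    alpha = 1.0 + n / log_sum
    # Kolmogorov-Smirnov distance between the empirical CDF and the
    # fitted CDF F(x) = 1 - (x / x_min)**(1 - alpha).
    ks = 0.0
    for i, v in enumerate(tail):
        model = 1.0 - (v / x_min) ** (1.0 - alpha)
        ks = max(ks, abs(model - i / n), abs(model - (i + 1) / n))
    return alpha, ks


# Hypothetical wealth figures (billions); not taken from any rich list.
wealth = [1.0, 1.1, 1.3, 1.6, 2.0, 2.7, 4.0, 6.5, 12.0, 40.0]
print(fit_power_law_tail(wealth, x_min=1.0))
```

A small KS distance by itself does not establish a power law; rival distributions such as the log-normal may fit the same tail comparably well.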
Information theory provides ideas for conceptualising information and
measuring relationships between objects. It has found wide application in the
sciences, but economics and finance have made surprisingly little use of it. We
show that time series data can usefully be studied as information -- by noting
the relationship between statistical redundancy and dependence, we are able to
use the results of information theory to construct a test for joint dependence
of random variables. The test is in the same spirit as those developed by
Ryabko and Astola (2005, 2006b,a), but differs from these in that we add extra
randomness to the original stochastic process. It uses data compression to
estimate the entropy rate of a stochastic process, which allows it to measure
dependence among sets of random variables, as opposed to the existing
econometric literature that uses entropy and finds itself restricted to
pairwise tests of dependence. We show how serial dependence may be detected in
S&P500 and PSI20 stock returns over different sample periods and frequencies.
We apply the test to synthetic data to judge its ability to recover known
temporal dependence structures.
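The link between compressibility and dependence can be illustrated with a short Python sketch. It is not the test described above: it omits the added randomness and any formal test statistic, and simply compares the compressed size of a discretised series with that of shuffled copies, which keep the marginal distribution but destroy temporal order. The function names, the quantile discretisation, and the choice of `zlib` as compressor are all assumptions made for illustration.

```python
import random
import zlib


def discretise(values, n_bins=4):
    """Replace each value by its quantile bin (0 .. n_bins - 1); n_bins <= 256."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    symbols = [0] * len(values)
    for rank, idx in enumerate(order):
        symbols[idx] = rank * n_bins // len(values)
    return symbols


def compressed_size(symbols):
    """Size in bytes of the zlib-compressed symbol sequence."""
    return len(zlib.compress(bytes(symbols), 9))


def compression_dependence_score(symbols, n_shuffles=20, seed=0):
    """Compressed size of the series relative to the mean over shuffled copies.

    Values well below 1 suggest temporal structure the compressor can exploit;
    values near 1 are what an independent sequence would give.
    """
    rng = random.Random(seed)
    original = compressed_size(symbols)
    shuffled_sizes = []
    for _ in range(n_shuffles):
        perm = list(symbols)
        rng.shuffle(perm)
        shuffled_sizes.append(compressed_size(perm))
    return original / (sum(shuffled_sizes) / n_shuffles)


# Hypothetical input: a strongly periodic series standing in for returns.
series = [i % 100 for i in range(2000)]
print(compression_dependence_score(discretise(series)))
```

Because the whole sequence is compressed at once, the same score applies unchanged to symbols built from several variables jointly, which is the sense in which compression is not restricted to pairwise comparisons.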
It is well known that the distributions of returns from various financial
instruments are leptokurtic, meaning that the distributions have "fatter tails"
than a Normal distribution, and have skew toward zero. This paper presents a
graceful micro-level explanation for such fat-tailed outcomes, using agents
whose private valuations have Normally-distributed errors, but whose utility
function includes a term for the percentage of others who also buy.
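Leptokurtosis is usually quantified through the excess kurtosis, which is zero for a Normal distribution and positive for fatter-tailed ones. The following Python sketch computes the sample value; the function name `excess_kurtosis` and the sample returns are hypothetical, and the agent-based mechanism itself is not reproduced here.

```python
def excess_kurtosis(returns):
    """Sample excess kurtosis: m4 / m2**2 - 3 (zero for a Normal distribution)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    if m2 == 0:
        raise ValueError("returns have zero variance")
    return m4 / (m2 * m2) - 3.0


# Hypothetical returns: mostly tiny moves with two rare, large ones.
calm_with_shocks = [0.0] * 98 + [10.0, -10.0]
print(excess_kurtosis(calm_with_shocks))
```

A series dominated by small moves with occasional large jumps, as above, yields a strongly positive value.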