👋 Welcome to Rong’Log
The task of our existence is to become as many-sided as possible. But being many-sided does not mean knowing many things; it means loving many things. — Jacob Burckhardt
Exercise 1.1 Assume that, in the one-period binomial market of Section 1.1 (where the stock price $S$ has two possible outcomes in the next period: it can either go up to $uS_0$ or down to $dS_0$), both H and T have positive probability of occurring. Show that the condition $0<d<1+r<u$ precludes arbitrage. In other words, show that if $X_0 = 0$ and $$X_1=\Delta_0 S_1 + (1+r)(X_0 - \Delta_0 S_0),$$ then we cannot have $X_1$ strictly positive with positive probability unless $X_1$ is strictly negative with positive probability as well, and this is the case regardless of the choice of the number $\Delta_0$....
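One way to see why the condition is enough (a sketch only, assuming $S_0>0$ as is standard in the binomial model): setting $X_0=0$ in the wealth equation gives $$X_1=\Delta_0\bigl(S_1-(1+r)S_0\bigr),$$ so the two possible outcomes are $$X_1(H)=\Delta_0 S_0\bigl(u-(1+r)\bigr),\qquad X_1(T)=\Delta_0 S_0\bigl(d-(1+r)\bigr).$$ Because $d<1+r<u$, the factor $u-(1+r)$ is strictly positive and $d-(1+r)$ is strictly negative. Hence for $\Delta_0>0$ we get $X_1(H)>0>X_1(T)$, for $\Delta_0<0$ we get $X_1(H)<0<X_1(T)$, and for $\Delta_0=0$ we get $X_1\equiv 0$. Since both H and T occur with positive probability, $X_1$ can be strictly positive with positive probability only if it is also strictly negative with positive probability.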
Balancing Performance and Safety in RL
Safe Reinforcement Learning (RL) is a subset of RL that focuses on learning policies that not only maximize the long-term reward but also ensure reasonable system performance and/or respect safety constraints during the learning and/or deployment processes [1]. Safety is the opposite of risk, where risk refers to the stochastic nature of the environment. An optimal policy for long-term reward maximization may still perform poorly in some catastrophic situations due to inherent uncertainty....
LSTM is a type of recurrent neural network that is widely used in natural language processing, speech recognition, and other applications where sequential data is important. LSTMs are particularly effective at capturing long-term dependencies in sequences of data, which can be challenging for other types of neural networks. Navigating the jargon associated with the components of LSTM networks can be daunting, even for those familiar with neural networks. Terms like “cell,” “layer,” “unit,” and “neuron” are often thrown around without a clear explanation of their meaning and purpose....
Reinforcement learning (RL) is a powerful framework for training agents to maximize cumulative reward, but it typically assumes risk-neutrality. This can lead to suboptimal behavior in practical scenarios where the consequences of unfavorable outcomes can be detrimental. What is risk? Generally, risk might arise whenever there is uncertainty. In a financial situation, investment risk can be identified with uncertain monetary loss. In a safety-critical engineering system, risk is the undesirable detrimental outcome....
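To make the risk-neutrality point concrete, here is a minimal Python sketch. The two return lists are invented for illustration, the function names are hypothetical, and the tail average used here (often called conditional value at risk, CVaR) is only one possible risk measure, chosen as an assumption rather than taken from the post.

```python
import statistics


def mean_return(returns):
    """Risk-neutral score: the plain average of the episode returns."""
    return statistics.fmean(returns)


def tail_average(returns, worst_fraction=0.2):
    """Risk-sensitive score: average of the worst `worst_fraction` of returns."""
    k = max(1, round(len(returns) * worst_fraction))
    worst = sorted(returns)[:k]  # the k lowest returns
    return statistics.fmean(worst)


# Hypothetical episode returns for two policies (illustrative numbers only).
safe_policy = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
risky_policy = [20, 20, 20, 20, 20, 20, 20, 20, -30, -30]

if __name__ == "__main__":
    for name, returns in [("safe", safe_policy), ("risky", risky_policy)]:
        print(name, mean_return(returns), tail_average(returns))
```

Both policies have the same average return, so a risk-neutral objective cannot tell them apart, while the tail average exposes the rare large losses of the second one.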
Setting up an ML environment can be tricky. Here’s what worked for me for setting up the environment and keeping track of experiments. Project Setup Directory Structure Cookiecutter Computational Environment PyCharm virtualenv pip install the packages:
[1] Unzip the downloaded mjpro150 into ~/.mujoco/mjpro150, and place the mjkey.txt file at ~/.mujoco/mjkey.txt.
[2] Run pip3 install -U 'mujoco-py<1.50.2,>=1.50.1'
[3] Remove ~/.mujoco/mjpro150/bin/libglfw.3.dylib
[4] Run brew install llvm boost hdf5 glfw...
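The file placement described in steps [1] and [3] can be sanity-checked with a short script. This is a sketch, not part of the original setup: the function names are hypothetical, and the paths are simply the ones listed in the steps above.

```python
from pathlib import Path


def missing_mujoco_files(home):
    """Return the paths from step [1] that do not exist under `home`."""
    expected = [
        Path(home) / ".mujoco" / "mjpro150",
        Path(home) / ".mujoco" / "mjkey.txt",
    ]
    return [str(p) for p in expected if not p.exists()]


def bundled_glfw_present(home):
    """True if the libglfw file that step [3] removes is still there."""
    lib = Path(home) / ".mujoco" / "mjpro150" / "bin" / "libglfw.3.dylib"
    return lib.exists()


if __name__ == "__main__":
    home = Path.home()
    print("missing:", missing_mujoco_files(home))
    print("bundled libglfw still present:", bundled_glfw_present(home))
```

An empty "missing" list and `False` for the second check would suggest the layout matches the steps; it does not verify that mujoco-py itself imports correctly.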