The notes of Justin Abrahms

        policy gradient algorithms

        Aug 25, 2024 · 1 min read

        • project
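        1. sample actions from the current policy
        2. observe the rewards those actions produce
        3. tweak the policy parameters so that higher-reward actions become more likely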
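
A minimal sketch of that three-step loop, assuming a toy multi-armed bandit with a softmax policy and a REINFORCE-style update with a running-average baseline. None of this setup comes from the note or its source: `train_bandit`, `arm_means`, the learning rate, and the Gaussian reward noise are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn a list of preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(arm_means, steps=5000, lr=0.1, seed=0):
    """Hypothetical toy trainer: learn which arm pays best.

    arm_means: assumed true mean reward of each arm (unknown to the policy).
    Returns the final action probabilities of the softmax policy.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(arm_means)  # policy parameters, one per action
    baseline = 0.0                  # running average of observed rewards

    for t in range(1, steps + 1):
        probs = softmax(prefs)

        # 1. sample an action from the current policy
        action = rng.choices(range(len(prefs)), weights=probs)[0]

        # 2. observe a (noisy) reward for that action
        reward = rng.gauss(arm_means[action], 1.0)

        # 3. tweak the policy: step along
        #    (reward - baseline) * grad of log pi(action)
        advantage = reward - baseline
        for a in range(len(prefs)):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * advantage * grad_log_pi

        # update the running-average baseline (reduces variance)
        baseline += (reward - baseline) / t

    return softmax(prefs)


if __name__ == "__main__":
    print(train_bandit([0.0, 1.0, 3.0]))
```

The baseline is not required for the update to work. It only shrinks the variance of the gradient estimate, so the policy settles on the best-paying arm with less noise along the way.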

        Sources

        https://towardsdatascience.com/policy-gradients-in-reinforcement-learning-explained-ecec7df94245


        Backlinks

        • Proximal Policy Optimization
