Module: tf_agents.bandits.agents.neural_linucb_agent
Implements the Neural + LinUCB bandit algorithm.
Applies LinUCB on top of an encoding network.
Since LinUCB is a linear method, the encoding network is used to capture the
non-linear relationship between the context features and the expected rewards.
The encoding network may or may not be pre-trained; if it is not, the agent
can optionally train it while acting with an epsilon-greedy policy.
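The idea of running LinUCB on encoded contexts can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the TF-Agents implementation: the class LinUCBOnEncoding, the stand-in tanh_encoder, and the parameter alpha are hypothetical names chosen for this example, and a fixed tanh map stands in for a neural encoding network.

```python
import numpy as np


def tanh_encoder(context):
    """Hypothetical stand-in for an encoding network: a fixed non-linear map."""
    return np.tanh(np.asarray(context, dtype=float))


class LinUCBOnEncoding:
    """Sketch of LinUCB with one linear model per arm, fit on encoded contexts."""

    def __init__(self, encoder, encoding_dim, num_actions, alpha=1.0):
        self.encoder = encoder  # callable: context -> vector of length encoding_dim
        self.alpha = alpha      # scales the exploration bonus (assumed name)
        # Per-arm ridge-regression statistics: A = I + sum(z z^T), b = sum(r z).
        self.A = np.stack([np.eye(encoding_dim) for _ in range(num_actions)])
        self.b = np.zeros((num_actions, encoding_dim))

    def ucb_scores(self, context):
        """Returns the upper confidence bound of the reward for every arm."""
        z = self.encoder(context)
        scores = np.empty(len(self.A))
        for a in range(len(self.A)):
            theta = np.linalg.solve(self.A[a], self.b[a])       # reward weights
            width = np.sqrt(z @ np.linalg.solve(self.A[a], z))  # confidence width
            scores[a] = theta @ z + self.alpha * width
        return scores

    def select_action(self, context):
        """Picks the arm with the highest upper confidence bound."""
        return int(np.argmax(self.ucb_scores(context)))

    def update(self, context, action, reward):
        """Adds one (context, action, reward) observation to the chosen arm."""
        z = self.encoder(context)
        self.A[action] += np.outer(z, z)
        self.b[action] += reward * z


# Usage: two arms, three context features, encoder output of the same size.
agent = LinUCBOnEncoding(tanh_encoder, encoding_dim=3, num_actions=2)
context = [1.0, 0.5, -0.2]
action = agent.select_action(context)
agent.update(context, action, reward=1.0)
```

The linear models only ever see the encoder output, so any non-linearity in the reward has to be captured by the encoder; the confidence width shrinks for an arm as more encoded contexts are observed for it.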
Reference:
Carlos Riquelme, George Tucker, Jasper Snoek,
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep
Networks for Thompson Sampling, ICLR 2018.
Classes
class NeuralLinUCBAgent: An agent implementing the LinUCB algorithm on top of a neural network.
class NeuralLinUCBVariableCollection: A collection of variables used by NeuralLinUCBAgent.
absolute_import: Instance of __future__._Feature
division: Instance of __future__._Feature
print_function: Instance of __future__._Feature
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.