Abstract | ||
---|---|---|
We consider a model for sequential online decision-making by many diverse agents. On each day, each agent makes a decision, and pays a penalty if it is a mistake. Obviously, it would be good for agents to avoid repeating the same mistakes made by other agents; however, difficulty may arise when some agents disagree over what constitutes a mistake, perhaps maliciously. As a metric of success for this problem, we consider dynamic regret, i.e., regret versus the off-line optimal sequence of decisions. Previous regret bounds usually use the much weaker notion of static regret, i.e., regret versus the best single decision in hindsight. We assume there is a set of "honest" players whose valuations for the decisions at each time step are identical. No assumptions are made about the remaining players, and the algorithm assumes no information about which are the honest players. We present an algorithm for this setting whose expected dynamic regret per honest player is optimal up to a multiplicative constant and an additive polylogarithmic term, assuming the number of options is bounded. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1145/1248377.1248431 | SPAA |
Keywords | Field | DocType |
diverse agent,additive polylogarithmic term,honest player,remaining player,online collaborative,static regret,expected dynamic regret,off-line optimal sequence,dynamic regret,previous regret,optimal dynamic regret,single decision,collaborative filtering,regret,multiagent systems | Multiplicative function,Computer science,Multi-agent system,Artificial intelligence,Valuation (finance),Distributed computing,Mathematical economics,Collaborative filtering,Mistake,Regret,Hindsight bias,Machine learning,Bounded function | Conference |
Citations | PageRank | References |
6 | 0.46 | 15 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Baruch Awerbuch | 1 | 4040 | 657.22 |
Thomas P. Hayes | 2 | 659 | 54.21 |