


Nitish Shirish Keskar
2020 – today
- 2022
- [c16] Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong:
Modeling Multi-hop Question Answering as Single Sequence Prediction. ACL (1) 2022: 974-990
- [i27] Semih Yavuz, Kazuma Hashimoto, Yingbo Zhou, Nitish Shirish Keskar, Caiming Xiong:
Modeling Multi-hop Question Answering as Single Sequence Prediction. CoRR abs/2205.09226 (2022)
- [i26] Yongjun Chen, Jia Li, Zhiwei Liu, Nitish Shirish Keskar, Huan Wang, Julian J. McAuley, Caiming Xiong:
Generating Negative Samples for Sequential Recommendation. CoRR abs/2208.03645 (2022)
- 2021
- [c15] Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Rajani, Nitish Shirish Keskar, Thamar Solorio:
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality. EMNLP (Findings) 2021: 1640-1651
- [c14] Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq R. Joty, Richard Socher, Nazneen Fatema Rajani:
GeDi: Generative Discriminator Guided Sequence Generation. EMNLP (Findings) 2021: 4929-4952
- [c13] Tong Niu, Semih Yavuz, Yingbo Zhou, Nitish Shirish Keskar, Huan Wang, Caiming Xiong:
Unsupervised Paraphrasing with Pretrained Language Models. EMNLP (1) 2021: 5136-5150
- [c12] Sourya Basu, Govardana Sachitanandam Ramachandran, Nitish Shirish Keskar, Lav R. Varshney:
Mirostat: a Neural Text decoding Algorithm that directly controls perplexity. ICLR 2021
- [i25] Wenpeng Yin, Shelby Heinecke, Jia Li, Nitish Shirish Keskar, Michael Jones, Shouzhong Shi, Stanislav Georgiev, Kurt Milich, Joseph Esposito, Caiming Xiong:
Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution. CoRR abs/2111.10497 (2021)
- 2020
- [c11] Huan Wang, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
Assessing Local Generalization Capability in Deep Models. AISTATS 2020: 2077-2087
- [c10] Semih Yavuz, Kazuma Hashimoto, Wenhao Liu, Nitish Shirish Keskar, Richard Socher, Caiming Xiong:
Simple Data Augmentation with the Mask Token Improves Domain Adaptation for Dialog Act Tagging. EMNLP (1) 2020: 5083-5089
- [c9] Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher:
The Thieves on Sesame Street are Polyglots - Extracting Multilingual Models from Monolingual APIs. EMNLP (1) 2020: 6203-6207
- [c8] Lav R. Varshney, Nitish Shirish Keskar, Richard Socher:
Limits of Detecting Text Generated by Large-Scale Language Models. ITA 2020: 1-5
- [i24] Lav R. Varshney, Nitish Shirish Keskar, Richard Socher:
Limits of Detecting Text Generated by Large-Scale Language Models. CoRR abs/2002.03438 (2020)
- [i23] Isabela Albuquerque, Nikhil Naik, Junnan Li, Nitish Shirish Keskar, Richard Socher:
Improving out-of-distribution generalization via multi-task self-supervised pretraining. CoRR abs/2003.13525 (2020)
- [i22] Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Raphael R. Eguchi, Po-Ssu Huang, Richard Socher:
ProGen: Language Modeling for Protein Generation. CoRR abs/2004.03497 (2020)
- [i21] Sourya Basu, Govardana Sachitanandam Ramachandran, Nitish Shirish Keskar, Lav R. Varshney:
Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm. CoRR abs/2007.14966 (2020)
- [i20] Ben Krause, Akhilesh Deepak Gotmare, Bryan McCann, Nitish Shirish Keskar, Shafiq R. Joty, Richard Socher, Nazneen Fatema Rajani:
GeDi: Generative Discriminator Guided Sequence Generation. CoRR abs/2009.06367 (2020)
- [i19] Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Fatema Rajani, Nitish Shirish Keskar, Thamar Solorio:
Char2Subword: Extending the Subword Embedding Space from Pre-trained Models Using Robust Character Compositionality. CoRR abs/2010.12730 (2020)
- [i18] Tong Niu, Semih Yavuz, Yingbo Zhou, Huan Wang, Nitish Shirish Keskar, Caiming Xiong:
Unsupervised Paraphrase Generation via Dynamic Blocking. CoRR abs/2010.12885 (2020)
2010 – 2019
- 2019
- [j3] Nitish Shirish Keskar, Andreas Wächter:
A limited-memory quasi-Newton algorithm for bound-constrained non-smooth optimization. Optim. Methods Softw. 34(1): 150-171 (2019)
- [j2] Albert S. Berahas, Raghu Bollapragada, Nitish Shirish Keskar, Ermin Wei:
Balancing Communication and Computation in Distributed Optimization. IEEE Trans. Autom. Control. 64(8): 3141-3155 (2019)
- [c7] Wojciech Kryscinski, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher:
Neural Text Summarization: A Critical Evaluation. EMNLP/IJCNLP (1) 2019: 540-551
- [c6] Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation. ICLR (Poster) 2019
- [c5] Victor Zhong, Caiming Xiong, Nitish Shirish Keskar, Richard Socher:
Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering. ICLR (Poster) 2019
- [i17] Victor Zhong, Caiming Xiong, Nitish Shirish Keskar, Richard Socher:
Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering. CoRR abs/1901.00603 (2019)
- [i16] Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher:
Unifying Question Answering and Text Classification via Span Extraction. CoRR abs/1904.09286 (2019)
- [i15] Jasdeep Singh, Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering. CoRR abs/1905.11471 (2019)
- [i14] Wojciech Kryscinski, Nitish Shirish Keskar, Bryan McCann, Caiming Xiong, Richard Socher:
Neural Text Summarization: A Critical Evaluation. CoRR abs/1908.08960 (2019)
- [i13] Lav R. Varshney, Nitish Shirish Keskar, Richard Socher:
Pretrained AI Models: Performativity, Mobility, and Change. CoRR abs/1909.03290 (2019)
- [i12] Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher:
CTRL: A Conditional Transformer Language Model for Controllable Generation. CoRR abs/1909.05858 (2019)
- [i11] Ryan Theisen, Jason M. Klusowski, Huan Wang, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
Global Capacity Measures for Deep ReLU Networks via Path Sampling. CoRR abs/1910.10245 (2019)
- 2018
- [c4] Stephen Merity, Nitish Shirish Keskar, Richard Socher:
Regularizing and Optimizing LSTM Language Models. ICLR (Poster) 2018
- [i10] Stephen Merity, Nitish Shirish Keskar, Richard Socher:
An Analysis of Neural Language Modeling at Multiple Scales. CoRR abs/1803.08240 (2018)
- [i9] Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
Using Mode Connectivity for Loss Landscape Analysis. CoRR abs/1806.06977 (2018)
- [i8] Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
The Natural Language Decathlon: Multitask Learning as Question Answering. CoRR abs/1806.08730 (2018)
- [i7] Huan Wang, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
Identifying Generalization Properties in Neural Networks. CoRR abs/1809.07402 (2018)
- [i6] Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher:
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation. CoRR abs/1810.13243 (2018)
- 2017
- [c3] Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang:
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. ICLR 2017
- [i5] Stephen Merity, Nitish Shirish Keskar, Richard Socher:
Regularizing and Optimizing LSTM Language Models. CoRR abs/1708.02182 (2017)
- [i4] Karim Ahmed, Nitish Shirish Keskar, Richard Socher:
Weighted Transformer Network for Machine Translation. CoRR abs/1711.02132 (2017)
- [i3] Nitish Shirish Keskar, Richard Socher:
Improving Generalization Performance by Switching from Adam to SGD. CoRR abs/1712.07628 (2017)
- 2016
- [j1] Nitish Shirish Keskar, Jorge Nocedal, Figen Öztoprak, Andreas Wächter:
A second-order method for convex l1-regularized optimization with active-set prediction. Optim. Methods Softw. 31(3): 605-621 (2016)
- [c2] Nitish Shirish Keskar, Albert S. Berahas:
adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs. ECML/PKDD (1) 2016: 1-16
- [i2] Nitish Shirish Keskar, Dheevatsa Mudigere, Jorge Nocedal, Mikhail Smelyanskiy, Ping Tak Peter Tang:
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. CoRR abs/1609.04836 (2016)
- 2015
- [c1] Nitish Shirish Keskar, George Saon:
A nonmonotone learning rate strategy for SGD training of deep neural networks. ICASSP 2015: 4974-4978
- [i1] Nitish Shirish Keskar, Albert S. Berahas:
adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs. CoRR abs/1511.01169 (2015)
last updated on 2022-08-25 05:22 CEST by the dblp team
all metadata released as open data under CC0 1.0 license