Learning on Graphs Conference, 2022
Thank you for agreeing to serve as a reviewer for LoG 2022!
The Area Chair (AC) assigned to a paper should be your first point of contact for that paper. You can contact the AC by leaving a comment in OpenReview with the AC as a reader. (The Program Chairs (PCs) will also be listed as readers, but will not be notified.)
If you encounter a situation that you are unable to resolve with your AC, please contact the Program Chairs at pcs@logconference.org. Please refrain from writing to the Program Chairs at their own email addresses.
Fulfilling your responsibilities as a reviewer in a high-quality and timely manner is critical to the success of the review process. Here is a list of key tasks for reviewers:
“Review the papers of others as you would wish your own to be reviewed”
A review aims to determine whether a submission will bring sufficient value to the community and contribute new knowledge. The process can be broken down into the following main reviewer tasks:
Reading the paper: Be sure to invest sufficient time to fully understand the paper, and look up related work that will help you evaluate it.
While reading, consider the following:
Answer three key questions for yourself to make a recommendation to Accept or Reject:
Write your initial review, organizing it as follows (below are example reviews):
General points to consider:
Engage in discussion: During the discussion phase, reviewers, authors, and Area Chairs engage in asynchronous discussion. Authors can revise their submissions to address concerns that arise. It is crucial that you actively engage and respond, i.e., you should be able to respond to comments/requests within 3 business days.
Provide final recommendation: Update your review, taking into account the new information collected during the discussion phase and any revisions to the submission. Remain open to changing your initial recommendation (to either a more positive or a more negative rating).
Below is a description of the questions you will be asked on the review form for each paper and some guidelines on what to consider when answering these questions. Feel free to use the LoG paper checklist included in each paper as a tool when preparing your review (some submissions may have the checklist as part of the supplementary materials). Remember that answering “no” to some questions is typically not grounds for rejection. When writing your review, please keep in mind that after decisions have been made, reviews and meta-reviews of accepted papers and opted-in rejected papers will be made public.
Main Review: Write your review comments here. Be sure to include:
Overall Recommendation: Please provide an overall score for this submission.
Confidence: Please provide a score for your assessment of this submission to indicate how confident you are in your evaluation.
Ethical Flag: If there are ethical issues with this paper, please flag the paper for an ethics review. For guidance on when this is appropriate, please review the NeurIPS ethics guidelines.
During the review process you will be working with:
Please make sure to review the policies in the LoG 2022 Call for Papers.
You must keep everything relating to the review process confidential. Do not use ideas, code, or results from submissions in your own work until they become publicly available. Do not talk about or share submissions with anyone without prior approval from the Program Chairs. Code submitted for reviewing cannot be distributed or used for any other purpose.
The reviewing process is double blind at the level of reviewers and ACs (i.e., reviewers and ACs cannot see author identities) but not at the level of Program Chairs (Program Chairs can see identities of everyone). Authors are responsible for anonymizing their submissions (we recommend using https://anonymous.4open.science/ to anonymize GitHub repositories). Submissions may not contain any identifying information that may violate the double-blind reviewing policy. This policy applies to any supplementary or linked material as well, including code. If you are assigned a submission that is not adequately anonymized, please contact the corresponding AC. Please do not attempt to find out the identities of the authors for any of your assigned submissions (e.g., by searching on arXiv). This would constitute an active violation of the double-blind reviewing policy.
As a reminder, full-paper submissions are limited to 9 main pages and extended abstracts to 4 main pages. This includes all figures and tables. Additional pages containing references, an appendix, and the LoG 2022 paper checklist are allowed. In general, we were lenient with minor formatting violations (e.g., a spillover to page 10 or tables that are not in the LoG style), as long as these violations can be easily rectified in the final version. If you find violations that cannot be easily rectified without causing other presentation issues, please flag them to your AC. Some submissions may have included the LoG 2022 checklist in their supplementary material by mistake, so you may find the checklist there (to be viewed at your discretion).
For the archival full-paper (9-page) track, LoG does not allow submissions that are identical or substantially similar to papers that are in submission to, have been accepted to, or have been published in other archival venues. Submissions that are identical or substantially similar to other LoG submissions fall under this policy as well; all LoG submissions should be distinct and sufficiently substantial. Slicing contributions too thinly is discouraged and may fall under the dual submission policy. If you suspect that a submission assigned to you is a dual submission, or if you require further clarification, please contact the corresponding AC.
Below are two reviews, copied verbatim from previous ICLR conferences, that adhere well to our guidelines above: one for an “Accept” recommendation, and the other for a “Reject” recommendation. Note that while each review is formatted differently according to each reviewer’s style, both reviews are well-structured and therefore easy to navigate.
##########################################################################
Summary:
The paper provides an interesting direction in the meta-learning field. In particular, it proposes to enhance meta learning performance by fully exploring relations across multiple tasks. To capture such information, the authors develop a heterogeneity-aware meta-learning framework by introducing a novel architecture–meta-knowledge graph, which can dynamically find the most relevant structure for new tasks.
##########################################################################
Reasons for score:
Overall, I vote for accepting. I like the idea of mining the relation between tasks and handle it by the proposed meta-knowledge graph. My major concern is about the clarity of the paper and some additional ablation models (see cons below). Hopefully the authors can address my concern in the rebuttal period.
##########################################################################
Pros:
The paper takes one of the most important issues of meta-learning: task heterogeneity. For me, the problem itself is real and practical.
The proposed meta-knowledge graph is novel for capturing the relation between tasks and address the problem of task heterogeneity. Graph structure provides a more flexible way of modeling relations. The design for using the prototype-based relational graph to query the meta-knowledge graph is reasonable and interesting.
This paper provides comprehensive experiments, including both qualitative analysis and quantitative results, to show the effectiveness of the proposed framework. The newly constructed Art-Multi dataset further enhances the difficulty of tasks and makes the performance more convincing.
##########################################################################
Cons:
Although the proposed method provides several ablation studies, I still suggest the authors to conduct the following ablation studies to enhance the quality of the paper: (1) It might be valuable to investigate the modulation function. In the paper, the authors compare sigmoid, tanh, and Film layer. Can the authors analyze the results by reducing the number of gating parameters in Eq. 10 by sharing the gate value of each filter in Conv layers? (2) What is the performance of the proposed model by changing the type of aggregators?
For the autoencoder aggregator, it would be better to provide more details about it, which seems not very clear to me.
In the qualitative analysis (i.e., Figure 2 and Figure 3), the authors provide one visualization for each task. It would be more convincing if the authors can provide more cases in the rebuttal period.
##########################################################################
Questions during rebuttal period:
Please address and clarify the cons above
#########################################################################
Some typos:
(1) Table 7: I. no sample-level graph -> I. no prototype-based graph
(2) 5.1 Hyperparameter Settings: we try both sigmoid, tanh Film -> we try both sigmoid, tanh, Film.
(3) parameteric -> parametric
(4) Table 2: Origninal -> original
(5) Section 4 first paragraph: The enhanced prototype representation -> The enhanced prototype representations
Updates: Thanks for the authors’ response. The newly added experimental results address my concerns. I believe this paper will provide new insights for this field and I recommend this paper to be accepted.
Review: This paper proposes Recency Bias, an adaptive mini batch selection method for training deep neural networks. To select informative minibatches for training, the proposed method maintains a fixed size sliding window of past model predictions for each data sample. At a given iteration, samples which have highly inconsistent predictions within the sliding window are added to the minibatch. The main contribution of this paper is the introduction of a sliding window to remember past model predictions, as an improvement over the SOTA approach: Active Bias, which maintains a growing window of model predictions. Empirical studies are performed to show the superiority of Recency Bias over two SOTA approaches. Results are shown on the task of (1) image classification from scratch and (2) image classification by fine-tuning pretrained networks.
+ves:
Concerns:
The key concern about the paper is the lack of rigorous experimentation to study the usefulness of the proposed method. Despite the paper stating that there have been earlier work (Joseph et al., 2019 and Wang et al., 2019) that attempt mini-batch selection, the paper does not compare with them. This is limiting. Further, since the proposed method is not specific to the domain of images, evaluating it on tasks other than image classification, such as text classification for instance, would have helped validate its applicability across domains.
Considering the limited results, a deeper analysis of the proposed method would have been nice. The idea of a sliding window over a growing window is a generic one, and there have been many efforts to theoretically analyze active learning over the last two decades. How does the proposed method fit in there? (For example, how does the expected model variance change in this setting?) Some form of theoretical/analytical reasoning behind the effectiveness of recency bias (which is missing) would provide greater insights to the community and facilitate further research in this direction.
The claim of 20.5% reduction in test error mentioned in the abstract has not been clearly addressed and pointed out in the results section of the paper.
On the same note, the results are not conclusively in favor of the proposed method, and only is marginally better than the competitors. Why does online batch perform consistently than the proposed method? There is no discussion of these inferences from the results.
The results would have been more complete if results were shown in a setting where just recency bias is used without the use of the selection pressure parameter. In other words, an ablation study on the effect of the selection pressure parameter would have been very useful.
How important is the warm-up phase to the proposed method? Considering the paper states that this is required to get good estimates of the quantization index of the samples, some ablation studies on reducing/increasing the warm-up phase and showing the results would have been useful to understand this.
Fig 4: Why are there sharp dips periodically in all the graphs? What do these correspond to?
The intuition behind the method is described well, however, the proposed method would have been really solidified if it were analysed in the context of a simple machine learning problem (such as logistic regression). As an example, verifying if the chosen minibatch samples are actually close to the decision boundary of a model (even if the model is very simple) would have helped analyze the proposed method well.
Minor comments:
=====POST-REBUTTAL COMMENTS========
I thank the authors for the response and the efforts in the updated draft. Some of my queries were clarified. However, unfortunately, I still think more needs to be done to explain the consistency of the results and to study the generalizability of this work across datasets. I retain my original decision for these reasons.